* Code for estimating the impact of the alcohol ban on IPV (24 June 2024)
* Open the practice file
* We standardize outcome variables separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome variables for control group (Jharkhand) at baseline 

summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==0 


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      2,524    .0412044    .1988021          0          1
        emo2 |      2,524    .0388273    .1932214          0          1
        emo3 |      2,524    .0348653    .1834749          0          1
        phy1 |      2,524     .086767    .2815492          0          1
        phy2 |      2,524    .2044374    .4033698          0          1
-------------+---------------------------------------------------------
        phy3 |      2,524    .0736926    .2613217          0          1
        phy4 |      2,524    .0526941    .2234664          0          1
        phy5 |      2,524    .0110935    .1047606          0          1
        phy6 |      2,524    .0055468    .0742844          0          1
        phy7 |      2,524     .088748     .284436          0          1
-------------+---------------------------------------------------------
     sexual1 |      2,524    .0455626    .2085758          0          1
     sexual2 |      2,524    .0178288    .1323553          0          1
     sexual3 |      2,524    .0289223    .1676215          0          1
    control1 |      2,522    .2664552    .4421927          0          1
    control2 |      2,523    .0757035    .2645756          0          1
-------------+---------------------------------------------------------
    control3 |      2,521    .3201111    .4666115          0          1
    control4 |      2,520     .168254    .3741659          0          1
    control5 |      2,521    .2427608    .4288367          0          1
    control6 |      2,519    .4096864    .4918735          0          1


* standardize all outcome variables for all baseline observations (Bihar and Jharkhand) using the Jharkhand baseline mean and sd

gen emo_1_norm = (emo1 - .0412044)/.1988021 if year==0
gen emo_2_norm = (emo2 - .0388273)/.1932214 if year==0
gen emo_3_norm = (emo3 - .0348653)/.1834749 if year==0

gen phy_1_norm = (phy1 - .086767)/.2815492 if year==0
gen phy_2_norm = (phy2 - .2044374  )/.4033698 if year==0
gen phy_3_norm = (phy3 - .0736926 )/.2613217 if year==0
gen phy_4_norm = (phy4 - .0526941)/.2234664 if year==0
gen phy_5_norm = (phy5 - .0110935)/.1047606 if year==0
gen phy_6_norm = (phy6 - .0055468 )/.0742844 if year==0
gen phy_7_norm = (phy7 - .088748 )/.284436 if year==0

gen sexual_1_norm = (sexual1 - .0455626 )/.2085758 if year==0
gen sexual_2_norm = (sexual2 - .0178288 )/.1323553 if year==0 
gen sexual_3_norm = (sexual3 - .0289223)/.1676215 if year==0  

gen control_1_norm = (control1 - .2664552 )/.4421927 if year==0
gen control_2_norm = (control2 - .0757035)/.2645756 if year==0 
gen control_3_norm = (control3 - .3201111  )/.4666115 if year==0
gen control_4_norm = (control4 - .168254 )/.3741659 if year==0
gen control_5_norm = (control5 - .2427608)/.4288367 if year==0 
gen control_6_norm = (control6 -  .4096864)/.4918735 if year==0

* for endline observations 
* mean and sd of outcome variables for control group (Jharkhand) at endline
summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==1

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      2,339    .0726806    .2596671          0          1
        emo2 |      2,339    .0662676    .2488025          0          1
        emo3 |      2,339    .0692604    .2539507          0          1
        phy1 |      2,339    .1308251    .3372811          0          1
        phy2 |      2,339    .2501069     .433167          0          1
-------------+---------------------------------------------------------
        phy3 |      2,339    .0850791     .279059          0          1
        phy4 |      2,339    .0739632     .261767          0          1
        phy5 |      2,339    .0269346    .1619269          0          1
        phy6 |      2,339     .025652    .1581286          0          1
        phy7 |      2,339    .1269773    .3330188          0          1
-------------+---------------------------------------------------------
     sexual1 |      2,339    .0474562     .212658          0          1
     sexual2 |      2,339    .0376229    .1903232          0          1
     sexual3 |      2,339     .045746    .2089785          0          1
    control1 |      2,338    .3485885    .4766254          0          1
    control2 |      2,339    .1124412    .3159761          0          1
-------------+---------------------------------------------------------
    control3 |      2,338    .2617622    .4396879          0          1
    control4 |      2,337    .1852803    .3886079          0          1
    control5 |      2,339    .2988457    .4578499          0          1
    control6 |      2,338    .3195038    .4663841          0          1

* standardize all outcome variables for all endline observations (Bihar and Jharkhand) using the Jharkhand endline mean and sd

replace emo_1_norm = (emo1 - .0726806)/.2596671 if year==1
replace emo_2_norm = (emo2 - .0662676)/.2488025 if year==1
replace emo_3_norm = (emo3 - .0692604)/.2539507  if year==1

replace phy_1_norm = (phy1 - .1308251)/.3372811  if year==1
replace phy_2_norm = (phy2 - .2501069  )/.433167  if year==1
replace phy_3_norm = (phy3 - .0850791  )/.279059 if year==1
replace phy_4_norm = (phy4 - .0739632)/.261767 if year==1
replace phy_5_norm = (phy5 - .0269346)/.1619269  if year==1
replace phy_6_norm = (phy6 - .025652  )/.1581286  if year==1
replace phy_7_norm = (phy7 - .1269773 )/.3330188 if year==1

replace sexual_1_norm = (sexual1 - .0474562 )/.212658 if year==1
replace sexual_2_norm = (sexual2 - .0376229 )/.1903232 if year==1 
replace sexual_3_norm = (sexual3 - .045746)/.2089785  if year==1 

replace control_1_norm = (control1 - .3485885  )/.4766254 if year==1
replace control_2_norm = (control2 - .1124412)/.3159761 if year==1 
replace control_3_norm = (control3 - .2617622 )/.4396879 if year==1
replace control_4_norm = (control4 - .1852803 )/.3886079  if year==1
replace control_5_norm = (control5 - .2988457)/.4578499 if year==1 
replace control_6_norm = (control6 -  .3195038 )/.4663841 if year==1
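
* A minimal sketch of the same item-level standardization without hand-copied
* constants: it reads the Jharkhand (control) mean and sd from r() after
* summarize, separately by year. The _chk suffix is a hypothetical name chosen
* here so the variables created above are not overwritten; values may differ
* from the hard-coded version in the last printed digits.
foreach v of varlist emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6 phy7 ///
        sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 {
    gen double `v'_chk = .
    forvalues t = 0/1 {
        * r(mean) and r(sd) refer to the control group in year `t'
        quietly summarize `v' if bihar == 0 & year == `t'
        replace `v'_chk = (`v' - r(mean)) / r(sd) if year == `t'
    }
}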


gen emo_sum = emo_1_norm + emo_2_norm + emo_3_norm
gen phy_sum = phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm
gen sexual_sum = sexual_1_norm + sexual_2_norm + sexual_3_norm
gen control_sum = control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
gen ipv_sum = emo_1_norm + emo_2_norm + emo_3_norm + phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm + sexual_1_norm + sexual_2_norm + sexual_3_norm + control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
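
* The sums above use +, so an observation with any missing item gets a missing
* sum (this is why control_sum and ipv_sum have fewer observations below).
* A hedged sketch for counting such cases; ipv_nmiss is a hypothetical helper
* name, and ? matches exactly one character in a varlist.
egen ipv_nmiss = rowmiss(emo_?_norm phy_?_norm sexual_?_norm control_?_norm)
tabulate ipv_nmiss year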


* we standardize the outcome index again separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome index for control group (Jharkhand) at baseline 

sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 0 


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      2,524   -6.71e-08    2.405822  -.5982382   15.05764
     phy_sum |      2,524   -4.89e-07    4.679791  -1.825381   39.03024
  sexual_sum |      2,524    5.85e-07    2.457897  -.5256956   17.78997
 control_sum |      2,512    .0011103    3.692974  -3.423421   11.79833
     ipv_sum |      2,512   -.0211003    9.186726  -6.372736   80.16044


* we standardize the index variables for all baseline observations (Bihar and Jharkhand) using the Jharkhand baseline mean and sd

gen emo_index_norm     = (emo_sum -(-6.71e-08))/2.405822  if year==0
gen phy_index_norm     = (phy_sum - (-4.89e-07))/4.679791  if year==0
gen sexual_index_norm  = (sexual_sum - (5.85e-07))/2.457897  if year==0
gen control_index_norm = (control_sum - (.0011103))/3.692974  if year==0
gen ipv_index_norm     = (ipv_sum - (-.0211003))/9.186726  if year==0   	 


* for endline observations 
* mean and sd of outcome index for control group (Jharkhand) at endline 
sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 1


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      2,339    1.64e-07    2.550143   -.818977   10.98913
     phy_sum |      2,339    1.86e-07    5.122912  -2.262557     25.917
  sexual_sum |      2,339    1.81e-07    2.704619  -.6397393   14.10205
 control_sum |      2,335   -.0049836    3.901321  -3.497118   10.94167
     ipv_sum |      2,335   -.0008831    11.17254  -7.218392   61.94984


* we standardize the index variables for all endline observations (Bihar and Jharkhand) using the Jharkhand endline mean and sd

replace emo_index_norm     = (emo_sum -(1.64e-07))/2.550143  if year==1
replace phy_index_norm     = (phy_sum - (1.86e-07))/5.122912  if year==1
replace sexual_index_norm  = (sexual_sum - (1.81e-07))/2.704619  if year==1
replace control_index_norm = (control_sum - (-.0049836))/3.901321  if year==1
replace ipv_index_norm     = (ipv_sum - (-.0008831))/11.17254  if year==1   	 
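
* Optional check (a sketch): within the control group (bihar == 0) each index
* should have a mean close to 0 and an sd close to 1 in both years.
tabstat emo_index_norm phy_index_norm sexual_index_norm control_index_norm ipv_index_norm ///
    if bihar == 0, by(year) statistics(mean sd) format(%9.4f)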


* we have now standardized all outcome variables
* save this file as the master file

* Now we run the algorithmic approach to select control variables
* we run the selection algorithm only in the baseline dataset
* in the endline dataset we only check the controls selected at baseline for multicollinearity

* to identify the control variables to include, we keep only the baseline dataset
* we first create separate baseline and endline datasets

* open baseline dataset  
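
* A minimal sketch of this split, assuming the master file is in memory and
* that year == 0 marks baseline and year == 1 marks endline, as above. The
* file names baseline.dta and endline.dta are hypothetical placeholders.
preserve
keep if year == 0
save "baseline.dta", replace
restore

preserve
keep if year == 1
save "endline.dta", replace
restore

use "baseline.dta", clear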

* we create an empty model and then apply stepwise command to add the controls (5% significance)
* we only consider controls that are not likely to be affected by the policy 
* we do this step separately for treatment indicator and outcome variable


* with treatment variable 

* stepwise forward 
stepwise, lr  pe(0.05): logit bihar  resp_age resp_edu (i.religion) (i.caste) husb_edu year_marr hh_size hhd_age (i.hhd_gender) (i.place_residence) (i.wealth_index), or
 
* the or option at the end reports odds ratios instead of coefficients

* result below 

note: 0b.religion omitted because of estimability.
note: 0b.caste omitted because of estimability.
note: 0b.hhd_gender omitted because of estimability.
note: 0b.place_residence omitted because of estimability.
note: 1b.wealth_index omitted because of estimability.

LR test, begin with empty model:
p = 0.0000 <  0.0500, adding 1.caste
p = 0.0000 <  0.0500, adding 1.hhd_gender
p = 0.0000 <  0.0500, adding 1.religion
p = 0.0000 <  0.0500, adding 2.wealth_index 3.wealth_index 4.wealth_index 5.wealth_index
p = 0.0000 <  0.0500, adding hh_size
p = 0.0000 <  0.0500, adding resp_edu
p = 0.0000 <  0.0500, adding 1.place_residence
p = 0.0112 <  0.0500, adding year_marr
p = 0.0052 <  0.0500, adding resp_age

Logistic regression                                     Number of obs =  6,413
                                                        LR chi2(12)   = 893.67
                                                        Prob > chi2   = 0.0000
Log likelihood = -3846.6989                             Pseudo R2     = 0.1041

-----------------------------------------------------------------------------------
            bihar | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
------------------+----------------------------------------------------------------
          1.caste |   .3325877   .0205238   -17.84   0.000     .2946992    .3753474
     1.hhd_gender |   2.765767   .2162903    13.01   0.000     2.372736    3.223902
       1.religion |   2.374793   .1657278    12.39   0.000     2.071207    2.722876
                  |
     wealth_index |
               2  |   .9455646   .0699817    -0.76   0.449     .8178872    1.093173
               3  |   1.063646   .1007941     0.65   0.515     .8833539    1.280735
               4  |   1.192451   .1439666     1.46   0.145     .9411823    1.510802
               5  |   .5316116   .0840683    -4.00   0.000       .38993    .7247734
                  |
          hh_size |    1.10275   .0141957     7.60   0.000     1.075275    1.130927
         resp_edu |   .9492872   .0069457    -7.11   0.000     .9357711    .9629986
1.place_residence |   .6774268   .0566804    -4.65   0.000      .574966    .7981465
        year_marr |   .9702091   .0081919    -3.58   0.000     .9542853    .9863985
         resp_age |   1.024805   .0090149     2.79   0.005     1.007287    1.042627
            _cons |   .5215914   .1106659    -3.07   0.002     .3441368    .7905506
-----------------------------------------------------------------------------------
Note: _cons estimates baseline odds.

* we get 9 controls included in the forward stepwise algorithm

* we also conduct a backward stepwise estimation, starting from all controls and dropping the insignificant ones
* for treatment variable  

stepwise, lr  pr(0.05): logit bihar  resp_age resp_edu (i.religion) (i.caste) husb_edu year_marr hh_size hhd_age (i.hhd_gender) (i.place_residence) (i.wealth_index), or


note: 0b.religion omitted because of estimability.
note: 0b.caste omitted because of estimability.
note: 0b.hhd_gender omitted because of estimability.
note: 0b.place_residence omitted because of estimability.
note: 1b.wealth_index omitted because of estimability.

LR test, begin with full model:
p = 0.2676 >= 0.0500, removing husb_edu
p = 0.2138 >= 0.0500, removing hhd_age

Logistic regression                                     Number of obs =  6,413
                                                        LR chi2(12)   = 893.67
                                                        Prob > chi2   = 0.0000
Log likelihood = -3846.6989                             Pseudo R2     = 0.1041

-----------------------------------------------------------------------------------
            bihar | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
------------------+----------------------------------------------------------------
         resp_age |   1.024805   .0090149     2.79   0.005     1.007287    1.042627
         resp_edu |   .9492872   .0069457    -7.11   0.000     .9357711    .9629986
       1.religion |   2.374793   .1657278    12.39   0.000     2.071207    2.722876
          1.caste |   .3325877   .0205238   -17.84   0.000     .2946992    .3753474
                  |
     wealth_index |
               2  |   .9455646   .0699817    -0.76   0.449     .8178872    1.093173
               3  |   1.063646   .1007941     0.65   0.515     .8833539    1.280735
               4  |   1.192451   .1439666     1.46   0.145     .9411823    1.510802
               5  |   .5316116   .0840683    -4.00   0.000       .38993    .7247734
                  |
        year_marr |   .9702091   .0081919    -3.58   0.000     .9542853    .9863985
          hh_size |    1.10275   .0141957     7.60   0.000     1.075275    1.130927
1.place_residence |   .6774268   .0566804    -4.65   0.000      .574966    .7981465
     1.hhd_gender |   2.765767   .2162903    13.01   0.000     2.372736    3.223902
            _cons |   .5215914   .1106659    -3.07   0.002     .3441368    .7905506
-----------------------------------------------------------------------------------
Note: _cons estimates baseline odds.

* we get 9 controls in the backward stepwise algorithm
* we get the same set of control variables in forward and backward stepwise regression on the treatment variable
* we therefore get caste, hhd_gender, religion, wealth_index, resp_edu, hh_size, place_residence, year_marr and resp_age as linear controls (in the baseline model)


* We now run the stepwise control selection (both forward and backward) on the outcome variable (ipv_index_norm)

* stepwise forward 
stepwise,   pe(0.05): regress ipv_index_norm  resp_age resp_edu (i.religion) (i.caste) husb_edu year_marr hh_size hhd_age (i.hhd_gender) (i.place_residence) (i.wealth_index)

note: 0b.religion omitted because of estimability.
note: 0b.caste omitted because of estimability.
note: 0b.hhd_gender omitted because of estimability.
note: 0b.place_residence omitted because of estimability.
note: 1b.wealth_index omitted because of estimability.

Wald test, begin with empty model:
p = 0.0000 <  0.0500, adding resp_edu
p = 0.0000 <  0.0500, adding resp_age
p = 0.0001 <  0.0500, adding 1.religion
p = 0.0000 <  0.0500, adding husb_edu
p = 0.0056 <  0.0500, adding 1.hhd_gender
p = 0.0088 <  0.0500, adding 2.wealth_index 3.wealth_index 4.wealth_index 5.wealth_index
p = 0.0199 <  0.0500, adding hh_size
p = 0.0336 <  0.0500, adding hhd_age

      Source |       SS           df       MS      Number of obs   =     6,358
-------------+----------------------------------   F(11, 6346)     =     26.85
       Model |  543.161664        11  49.3783331   Prob > F        =    0.0000
    Residual |  11669.3835     6,346  1.83885653   R-squared       =    0.0445
-------------+----------------------------------   Adj R-squared   =    0.0428
       Total |  12212.5452     6,357   1.9211177   Root MSE        =     1.356

------------------------------------------------------------------------------
ipv_index_~m | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
    resp_edu |  -.0320266   .0050235    -6.38   0.000    -.0418744   -.0221787
    resp_age |  -.0097588   .0023112    -4.22   0.000    -.0142896    -.005228
  1.religion |   .1920186   .0434235     4.42   0.000     .1068939    .2771433
    husb_edu |  -.0142991   .0044661    -3.20   0.001    -.0230542   -.0055441
1.hhd_gender |   .1234272   .0442124     2.79   0.005     .0367559    .2100984
             |
wealth_index |
          2  |  -.0090012    .044933    -0.20   0.841    -.0970852    .0790827
          3  |  -.1426471   .0571461    -2.50   0.013    -.2546727   -.0306214
          4  |  -.0714431   .0714157    -1.00   0.317     -.211442    .0685559
          5  |  -.2755518   .0909534    -3.03   0.002    -.4538511   -.0972524
             |
     hh_size |   .0237926   .0080845     2.94   0.003     .0079442     .039641
     hhd_age |  -.0029077   .0013679    -2.13   0.034    -.0055893   -.0002262
       _cons |   .7634356    .100919     7.56   0.000     .5656003     .961271
------------------------------------------------------------------------------

* we get 8 control variables in the forward regression on the outcome variable (ipv_index_norm)


* stepwise backward 
stepwise,   pr(0.05): regress ipv_index_norm  resp_age resp_edu (i.religion) (i.caste) husb_edu year_marr hh_size hhd_age (i.hhd_gender) (i.place_residence) (i.wealth_index)

note: 0b.religion omitted because of estimability.
note: 0b.caste omitted because of estimability.
note: 0b.hhd_gender omitted because of estimability.
note: 0b.place_residence omitted because of estimability.
note: 1b.wealth_index omitted because of estimability.

Wald test, begin with full model:
p = 0.8459 >= 0.0500, removing 1.caste
p = 0.3063 >= 0.0500, removing 1.place_residence
p = 0.0922 >= 0.0500, removing year_marr

      Source |       SS           df       MS      Number of obs   =     6,358
-------------+----------------------------------   F(11, 6346)     =     26.85
       Model |  543.161664        11  49.3783331   Prob > F        =    0.0000
    Residual |  11669.3835     6,346  1.83885653   R-squared       =    0.0445
-------------+----------------------------------   Adj R-squared   =    0.0428
       Total |  12212.5452     6,357   1.9211177   Root MSE        =     1.356

------------------------------------------------------------------------------
ipv_index_~m | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
    resp_age |  -.0097588   .0023112    -4.22   0.000    -.0142896    -.005228
    resp_edu |  -.0320266   .0050235    -6.38   0.000    -.0418744   -.0221787
  1.religion |   .1920186   .0434235     4.42   0.000     .1068939    .2771433
             |
wealth_index |
          2  |  -.0090012    .044933    -0.20   0.841    -.0970852    .0790827
          3  |  -.1426471   .0571461    -2.50   0.013    -.2546727   -.0306214
          4  |  -.0714431   .0714157    -1.00   0.317     -.211442    .0685559
          5  |  -.2755518   .0909534    -3.03   0.002    -.4538511   -.0972524
             |
    husb_edu |  -.0142991   .0044661    -3.20   0.001    -.0230542   -.0055441
1.hhd_gender |   .1234272   .0442124     2.79   0.005     .0367559    .2100984
     hh_size |   .0237926   .0080845     2.94   0.003     .0079442     .039641
     hhd_age |  -.0029077   .0013679    -2.13   0.034    -.0055893   -.0002262
       _cons |   .7634356    .100919     7.56   0.000     .5656003     .961271
------------------------------------------------------------------------------

* we get 8 control variables in the backward regression on the outcome variable (ipv_index_norm)
* we get the same set of control variables in forward and backward stepwise regression on the outcome variable
* we therefore get resp_age, resp_edu, religion, husb_edu, hhd_gender, wealth_index, hh_size and hhd_age as linear controls in the forward and backward regressions on the outcome variable

* we now include variables that show significance in either of the two sets of regressions (treatment indicator and outcome variable)
* these are resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu hhd_age caste year_marr place_residence (11 linear controls)
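
* A sketch of storing this union once so later commands can reuse it. The
* global name lin_controls is a hypothetical choice, not part of the original
* code; categorical controls carry the i. prefix as in the regressions above.
global lin_controls resp_age resp_edu i.religion i.wealth_index i.hhd_gender ///
    hh_size husb_edu hhd_age i.caste year_marr i.place_residence
display "$lin_controls"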





* QUADRATIC AND INTERACTION TERM ADDITION

* we now add quadratic and interaction variables built from the linear variables selected above (only those selected at the first stage)
* resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu hhd_age caste year_marr place_residence as linear controls
 
* we create quadratic and interaction term for each linear control variable

* we first create dummies for each categorical variable 

tabulate religion, generate(rel)
tabulate wealth_index, generate(wi)
tabulate hhd_gender, generate(hdgen)
tabulate caste, generate(cas)
tabulate place_residence, generate(pla)

* now we create interaction terms 
 
gen age_age = resp_age*resp_age
gen edu_age = resp_edu*resp_age
gen rel1_age =  (rel1)*resp_age
gen rel2_age =  (rel2)*resp_age
gen wi1_age = (wi1)*resp_age 
gen wi2_age = (wi2)*resp_age 
gen wi3_age = (wi3)*resp_age 
gen wi4_age = (wi4)*resp_age 
gen wi5_age = (wi5)*resp_age 
gen hdgen1_age = hdgen1*resp_age
gen hdgen2_age = hdgen2*resp_age
gen siz_age =  hh_size*resp_age 
gen hedu_age = husb_edu*resp_age
gen hhdage_age = hhd_age*resp_age
gen cas1_age =  cas1*resp_age
gen cas2_age =  cas2*resp_age
gen yea_age =  year_marr*resp_age
gen pla1_age =  pla1*resp_age
gen pla2_age =  pla2*resp_age

gen edu_edu =  resp_edu*resp_edu 
gen rel1_edu =  rel1*resp_edu
gen rel2_edu =  rel2*resp_edu
gen wi1_edu = wi1*resp_edu
gen wi2_edu = wi2*resp_edu
gen wi3_edu = wi3*resp_edu
gen wi4_edu = wi4*resp_edu
gen wi5_edu = wi5*resp_edu
gen hdgen1_edu = hdgen1*resp_edu
gen hdgen2_edu = hdgen2*resp_edu
gen edu_siz =  resp_edu*hh_size 
gen hedu_edu = husb_edu*resp_edu
gen hhdage_edu = hhd_age*resp_edu
gen cas1_edu = cas1*resp_edu
gen cas2_edu = cas2*resp_edu
gen edu_yea = resp_edu*year_marr 
gen edu_pla1 = resp_edu*pla1
gen edu_pla2 = resp_edu*pla2


gen rel1_wi1 =  rel1*wi1
gen rel1_wi2 =  rel1*wi2
gen rel1_wi3 =  rel1*wi3
gen rel1_wi4 =  rel1*wi4
gen rel1_wi5 =  rel1*wi5
gen rel2_wi1 =  rel2*wi1
gen rel2_wi2 =  rel2*wi2
gen rel2_wi3 =  rel2*wi3
gen rel2_wi4 =  rel2*wi4
gen rel2_wi5 =  rel2*wi5
gen hdgen1_rel1 =  hdgen1*rel1
gen hdgen1_rel2 =  hdgen1*rel2
gen hdgen2_rel1 =  hdgen2*rel1
gen hdgen2_rel2 =  hdgen2*rel2
gen rel1_siz =  rel1*hh_size
gen rel2_siz =  rel2*hh_size
gen hedu_rel1 = husb_edu*rel1
gen hedu_rel2 = husb_edu*rel2
gen hhdage_rel1 = hhd_age*rel1
gen hhdage_rel2 = hhd_age*rel2
gen cas1_rel1 =  cas1*rel1
gen cas1_rel2 =  cas1*rel2
gen cas2_rel1 =  cas2*rel1
gen cas2_rel2 =  cas2*rel2
gen rel1_yea =  rel1*year_marr
gen rel2_yea =  rel2*year_marr
gen rel1_pla1 =  rel1*pla1
gen rel1_pla2 =  rel1*pla2
gen rel2_pla1 =  rel2*pla1
gen rel2_pla2 =  rel2*pla2



gen hdgen1_wi1 = hdgen1*wi1
gen hdgen1_wi2 = hdgen1*wi2
gen hdgen1_wi3 = hdgen1*wi3
gen hdgen1_wi4 = hdgen1*wi4
gen hdgen1_wi5 = hdgen1*wi5
gen hdgen2_wi1 = hdgen2*wi1
gen hdgen2_wi2 = hdgen2*wi2
gen hdgen2_wi3 = hdgen2*wi3
gen hdgen2_wi4 = hdgen2*wi4
gen hdgen2_wi5 = hdgen2*wi5
gen wi1_siz = wi1*hh_size
gen wi2_siz = wi2*hh_size
gen wi3_siz = wi3*hh_size
gen wi4_siz = wi4*hh_size
gen wi5_siz = wi5*hh_size
gen hedu_wi1 = husb_edu*wi1
gen hedu_wi2 = husb_edu*wi2
gen hedu_wi3 = husb_edu*wi3
gen hedu_wi4 = husb_edu*wi4
gen hedu_wi5 = husb_edu*wi5
gen hhdage_wi1 = hhd_age*wi1
gen hhdage_wi2 = hhd_age*wi2
gen hhdage_wi3 = hhd_age*wi3
gen hhdage_wi4 = hhd_age*wi4
gen hhdage_wi5 = hhd_age*wi5
gen cas1_wi1 = cas1*wi1
gen cas1_wi2 = cas1*wi2
gen cas1_wi3 = cas1*wi3
gen cas1_wi4 = cas1*wi4
gen cas1_wi5 = cas1*wi5
gen cas2_wi1 = cas2*wi1
gen cas2_wi2 = cas2*wi2
gen cas2_wi3 = cas2*wi3                       
gen cas2_wi4 = cas2*wi4
gen cas2_wi5 = cas2*wi5
gen wi1_yea = wi1*year_marr 
gen wi2_yea = wi2*year_marr 
gen wi3_yea = wi3*year_marr 
gen wi4_yea = wi4*year_marr 
gen wi5_yea = wi5*year_marr 
gen wi1_pla1 = wi1*pla1
gen wi1_pla2 = wi1*pla2
gen wi2_pla1 = wi2*pla1
gen wi2_pla2 = wi2*pla2
gen wi3_pla1 = wi3*pla1
gen wi3_pla2 = wi3*pla2
gen wi4_pla1 = wi4*pla1
gen wi4_pla2 = wi4*pla2
gen wi5_pla1 = wi5*pla1
gen wi5_pla2 = wi5*pla2


gen hdgen1_siz =  hdgen1*hh_size
gen hdgen2_siz =  hdgen2*hh_size
gen hdgen1_hedu = hdgen1*husb_edu
gen hdgen2_hedu = hdgen2*husb_edu
gen hdgen1_hhdage = hdgen1*hhd_age
gen hdgen2_hhdage = hdgen2*hhd_age
gen cas1_hdgen1 =  cas1*hdgen1
gen cas1_hdgen2 =  cas1*hdgen2
gen cas2_hdgen1 =  cas2*hdgen1
gen cas2_hdgen2 =  cas2*hdgen2
gen hdgen1_yea = hdgen1*year_marr 
gen hdgen2_yea = hdgen2*year_marr 
gen hdgen1_pla1 =  hdgen1*pla1
gen hdgen1_pla2 =  hdgen1*pla2
gen hdgen2_pla1 =  hdgen2*pla1
gen hdgen2_pla2 =  hdgen2*pla2




gen siz_siz =  hh_size*hh_size 
gen hedu_siz = husb_edu*hh_size
gen hhdage_size = hhd_age*hh_size
gen cas1_siz =  cas1*hh_size 
gen cas2_siz =  cas2*hh_size 
gen siz_yea =  hh_size*year_marr 
gen siz_pla1 =  hh_size*pla1
gen siz_pla2 =  hh_size*pla2


gen hedu_hedu = husb_edu*husb_edu
gen hedu_hhdage = husb_edu*hhd_age
gen hedu_cas1 = husb_edu*cas1
gen hedu_cas2 = husb_edu*cas2
gen hedu_yea = husb_edu*year_marr
gen hedu_pla1 = husb_edu*pla1
gen hedu_pla2 = husb_edu*pla2


gen hhdage_hhdage = hhd_age*hhd_age
gen hhdage_cas1 = hhd_age*cas1 
gen hhdage_cas2 = hhd_age*cas2
gen hhdage_yea = hhd_age*year_marr
gen hhdage_pla1 = hhd_age*pla1
gen hhdage_pla2 = hhd_age*pla2


gen cas1_yea =  cas1*year_marr 
gen cas2_yea =  cas2*year_marr 
gen cas1_pla1 = cas1*pla1
gen cas1_pla2 = cas1*pla2
gen cas2_pla1 = cas2*pla1
gen cas2_pla2 = cas2*pla2


gen yea_yea =  year_marr*year_marr
gen pla1_yea =  pla1*year_marr
gen pla2_yea =  pla2*year_marr
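
* Why many of these terms are later reported as "omitted because of
* estimability" (a sketch, assuming religion has exactly the two categories
* behind rel1 and rel2): the two dummy interactions add up to the main effect,
* so one of each pair is redundant once resp_age is in the model.
* float() is used because gen stores the products as float by default.
assert rel1_age + rel2_age == float(resp_age) if !missing(religion, resp_age)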


* all quadratic and interaction variables are given below
* age_age edu_age rel1_age rel2_age wi1_age wi2_age wi3_age wi4_age wi5_age hdgen1_age hdgen2_age siz_age hedu_age hhdage_age cas1_age cas2_age yea_age pla1_age pla2_age edu_edu rel1_edu rel2_edu wi1_edu wi2_edu wi3_edu wi4_edu wi5_edu hdgen1_edu hdgen2_edu edu_siz hedu_edu hhdage_edu cas1_edu cas2_edu edu_yea edu_pla1 edu_pla2 rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 rel1_wi5 rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 rel2_wi5 hdgen1_rel1 hdgen1_rel2 hdgen2_rel1 hdgen2_rel2 rel1_siz rel2_siz hedu_rel1 hedu_rel2 hhdage_rel1 hhdage_rel2 cas1_rel1 cas1_rel2 cas2_rel1 cas2_rel2 rel1_yea rel2_yea rel1_pla1 rel1_pla2 rel2_pla1 rel2_pla2 hdgen1_wi1 hdgen1_wi2 hdgen1_wi3 hdgen1_wi4 hdgen1_wi5 hdgen2_wi1 hdgen2_wi2 hdgen2_wi3 hdgen2_wi4 hdgen2_wi5 wi1_siz wi2_siz wi3_siz wi4_siz wi5_siz hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 hedu_wi5 hhdage_wi1 hhdage_wi2 hhdage_wi3 hhdage_wi4 hhdage_wi5 cas1_wi1 cas1_wi2 cas1_wi3 cas1_wi4 cas1_wi5 cas2_wi1 cas2_wi2 cas2_wi3 cas2_wi4 cas2_wi5 wi1_yea wi2_yea wi3_yea wi4_yea wi5_yea wi1_pla1 wi1_pla2 wi2_pla1 wi2_pla2 wi3_pla1 wi3_pla2 wi4_pla1 wi4_pla2 wi5_pla1 wi5_pla2 hdgen1_siz hdgen2_siz hdgen1_hedu hdgen2_hedu hdgen1_hhdage hdgen2_hhdage cas1_hdgen1 cas1_hdgen2 cas2_hdgen1 cas2_hdgen2 hdgen1_yea hdgen2_yea hdgen1_pla1 hdgen1_pla2 hdgen2_pla1 hdgen2_pla2 siz_siz hedu_siz hhdage_size cas1_siz cas2_siz siz_yea siz_pla1 siz_pla2 hedu_hedu hedu_hhdage hedu_cas1 hedu_cas2 hedu_yea hedu_pla1 hedu_pla2 hhdage_hhdage hhdage_cas1 hhdage_cas2 hhdage_yea hhdage_pla1 hhdage_pla2 cas1_yea cas2_yea cas1_pla1 cas1_pla2 cas2_pla1 cas2_pla2 yea_yea pla1_yea pla2_yea


* we now run stepwise forward regression on treatment variable 

stepwise, lr  pe(0.05) lockterm1: logit bihar (resp_age resp_edu i.religion i.wealth_index i.hhd_gender hh_size husb_edu hhd_age i.caste year_marr i.place_residence) age_age edu_age (rel1_age rel2_age) (wi1_age wi2_age wi3_age wi4_age wi5_age) (hdgen1_age hdgen2_age) siz_age hedu_age hhdage_age (cas1_age cas2_age) yea_age (pla1_age pla2_age) edu_edu (rel1_edu rel2_edu) (wi1_edu wi2_edu wi3_edu wi4_edu wi5_edu) (hdgen1_edu hdgen2_edu) edu_siz hedu_edu hhdage_edu (cas1_edu cas2_edu) edu_yea (edu_pla1 edu_pla2) (rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 rel1_wi5 rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 rel2_wi5) (hdgen1_rel1 hdgen1_rel2 hdgen2_rel1 hdgen2_rel2) (rel1_siz rel2_siz) (hedu_rel1 hedu_rel2) (hhdage_rel1 hhdage_rel2) (cas1_rel1 cas1_rel2 cas2_rel1 cas2_rel2) (rel1_yea rel2_yea) (rel1_pla1 rel1_pla2 rel2_pla1 rel2_pla2) (hdgen1_wi1 hdgen1_wi2 hdgen1_wi3 hdgen1_wi4 hdgen1_wi5 hdgen2_wi1 hdgen2_wi2 hdgen2_wi3 hdgen2_wi4 hdgen2_wi5) (wi1_siz wi2_siz wi3_siz wi4_siz wi5_siz) (hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 hedu_wi5) (hhdage_wi1 hhdage_wi2 hhdage_wi3 hhdage_wi4 hhdage_wi5) (cas1_wi1 cas1_wi2 cas1_wi3 cas1_wi4 cas1_wi5 cas2_wi1 cas2_wi2 cas2_wi3 cas2_wi4 cas2_wi5) (wi1_yea wi2_yea wi3_yea wi4_yea wi5_yea) (wi1_pla1 wi1_pla2 wi2_pla1 wi2_pla2 wi3_pla1 wi3_pla2 wi4_pla1 wi4_pla2 wi5_pla1 wi5_pla2) (hdgen1_siz hdgen2_siz) (hdgen1_hedu hdgen2_hedu) (hdgen1_hhdage hdgen2_hhdage) (cas1_hdgen1 cas1_hdgen2 cas2_hdgen1 cas2_hdgen2) (hdgen1_yea hdgen2_yea) (hdgen1_pla1 hdgen1_pla2 hdgen2_pla1 hdgen2_pla2) siz_siz hedu_siz hhdage_size (cas1_siz cas2_siz) siz_yea (siz_pla1 siz_pla2) hedu_hedu hedu_hhdage (hedu_cas1 hedu_cas2) hedu_yea (hedu_pla1 hedu_pla2) hhdage_hhdage (hhdage_cas1 hhdage_cas2) hhdage_yea (hhdage_pla1 hhdage_pla2) (cas1_yea cas2_yea) (cas1_pla1 cas1_pla2 cas2_pla1 cas2_pla2) yea_yea (pla1_yea pla2_yea), or 


note: rel2_age omitted because of estimability.
note: wi5_age omitted because of estimability.
note: hdgen2_age omitted because of estimability.
note: cas2_age omitted because of estimability.
note: pla2_age omitted because of estimability.
note: rel2_edu omitted because of estimability.
note: wi5_edu omitted because of estimability.
note: hdgen2_edu omitted because of estimability.
note: cas2_edu omitted because of estimability.
note: edu_pla2 omitted because of estimability.
note: rel1_wi5 omitted because of estimability.
note: rel2_wi1 omitted because of estimability.
note: rel2_wi2 omitted because of estimability.
note: rel2_wi3 omitted because of estimability.
note: rel2_wi4 omitted because of estimability.
note: o.rel2_wi5 omitted because of estimability.
note: hdgen1_rel2 omitted because of estimability.
note: hdgen2_rel1 omitted because of estimability.
note: o.hdgen2_rel2 omitted because of estimability.
note: rel2_siz omitted because of estimability.
note: hedu_rel2 omitted because of estimability.
note: hhdage_rel2 omitted because of estimability.
note: cas1_rel2 omitted because of estimability.
note: cas2_rel1 omitted because of estimability.
note: o.cas2_rel2 omitted because of estimability.
note: rel2_yea omitted because of estimability.
note: rel1_pla2 omitted because of estimability.
note: rel2_pla1 omitted because of estimability.
note: o.rel2_pla2 omitted because of estimability.
note: hdgen1_wi5 omitted because of estimability.
note: hdgen2_wi1 omitted because of estimability.
note: hdgen2_wi2 omitted because of estimability.
note: hdgen2_wi3 omitted because of estimability.
note: hdgen2_wi4 omitted because of estimability.
note: o.hdgen2_wi5 omitted because of estimability.
note: wi5_siz omitted because of estimability.
note: hedu_wi5 omitted because of estimability.
note: hhdage_wi5 omitted because of estimability.
note: cas1_wi5 omitted because of estimability.
note: cas2_wi1 omitted because of estimability.
note: cas2_wi2 omitted because of estimability.
note: cas2_wi3 omitted because of estimability.
note: cas2_wi4 omitted because of estimability.
note: o.cas2_wi5 omitted because of estimability.
note: wi5_yea omitted because of estimability.
note: wi1_pla2 omitted because of estimability.
note: wi2_pla2 omitted because of estimability.
note: wi3_pla2 omitted because of estimability.
note: wi4_pla2 omitted because of estimability.
note: wi5_pla1 omitted because of estimability.
note: o.wi5_pla2 omitted because of estimability.
note: hdgen2_siz omitted because of estimability.
note: hdgen2_hedu omitted because of estimability.
note: hdgen2_hhdage omitted because of estimability.
note: cas1_hdgen2 omitted because of estimability.
note: cas2_hdgen1 omitted because of estimability.
note: o.cas2_hdgen2 omitted because of estimability.
note: hdgen2_yea omitted because of estimability.
note: hdgen1_pla2 omitted because of estimability.
note: hdgen2_pla1 omitted because of estimability.
note: o.hdgen2_pla2 omitted because of estimability.
note: cas2_siz omitted because of estimability.
note: siz_pla2 omitted because of estimability.
note: hedu_cas2 omitted because of estimability.
note: hedu_pla2 omitted because of estimability.
note: hhdage_cas2 omitted because of estimability.
note: hhdage_pla2 omitted because of estimability.
note: cas2_yea omitted because of estimability.
note: cas1_pla2 omitted because of estimability.
note: cas2_pla1 omitted because of estimability.
note: o.cas2_pla2 omitted because of estimability.
note: pla2_yea omitted because of estimability.

LR test, begin with term 1 model:
p = 0.0000 <  0.0500, adding cas1_rel1
p = 0.0000 <  0.0500, adding hedu_hedu
p = 0.0000 <  0.0500, adding hdgen1_hhdage
p = 0.0000 <  0.0500, adding edu_yea
p = 0.0014 <  0.0500, adding wi1_age wi2_age wi3_age wi4_age
p = 0.0010 <  0.0500, adding yea_yea
p = 0.0046 <  0.0500, adding hedu_age
p = 0.0047 <  0.0500, adding wi1_pla1 wi2_pla1 wi3_pla1 wi4_pla1
p = 0.0068 <  0.0500, adding hedu_cas1
p = 0.0071 <  0.0500, adding edu_edu
p = 0.0104 <  0.0500, adding rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4
p = 0.0145 <  0.0500, adding hedu_edu
p = 0.0026 <  0.0500, adding hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4
p = 0.0161 <  0.0500, adding edu_pla1
p = 0.0170 <  0.0500, adding cas1_yea
p = 0.0097 <  0.0500, adding cas1_age

Logistic regression                                    Number of obs =   6,413
                                                       LR chi2(42)   = 1237.94
                                                       Prob > chi2   =  0.0000
Log likelihood = -3674.5592                            Pseudo R2     =  0.1442

-----------------------------------------------------------------------------------
            bihar | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
------------------+----------------------------------------------------------------
         resp_age |   .9149007   .0236317    -3.44   0.001     .8697363    .9624106
         resp_edu |   .8196361   .0275515    -5.92   0.000     .7673766    .8754546
       1.religion |    38.0708   18.33714     7.56   0.000     14.81164     97.8545
                  |
     wealth_index |
               2  |   .2580284   .1072796    -3.26   0.001     .1142264     .582866
               3  |   .4284076   .2174671    -1.67   0.095     .1584063    1.158622
               4  |    .503663   .3386034    -1.02   0.308      .134863    1.880993
               5  |   1.971564   1.765854     0.76   0.449     .3407394    11.40773
                  |
     1.hhd_gender |   8.752142   2.182934     8.70   0.000      5.36797    14.26982
          hh_size |   1.106861   .0161597     6.95   0.000     1.075638    1.138991
         husb_edu |   .9768353   .0630043    -0.36   0.716     .8608356    1.108466
          hhd_age |    .979117   .0051747    -3.99   0.000     .9690271     .989312
          1.caste |   .7538427   .3117449    -0.68   0.494     .3351768     1.69546
        year_marr |   .9458865   .0202079    -2.60   0.009     .9070975    .9863343
1.place_residence |   .5572913   .2108335    -1.55   0.122     .2654955    1.169788
        cas1_rel1 |   6.868134   1.344389     9.84   0.000     4.679743    10.07988
        hedu_hedu |   1.008484   .0018179     4.69   0.000     1.004928    1.012054
    hdgen1_hhdage |   1.028129    .005817     4.90   0.000     1.016791    1.039593
          edu_yea |   1.006283   .0010756     5.86   0.000     1.004177    1.008394
          wi1_age |   1.076154   .0212455     3.72   0.000     1.035309     1.11861
          wi2_age |   1.095009   .0210397     4.72   0.000     1.054538    1.137032
          wi3_age |   1.071206   .0207168     3.56   0.000     1.031361    1.112589
          wi4_age |   1.065439   .0216436     3.12   0.002     1.023852    1.108716
          yea_yea |   1.001472   .0004105     3.59   0.000     1.000668    1.002277
         hedu_age |   1.001014   .0009711     1.04   0.296     .9991127    1.002919
         wi1_pla1 |   .4416918   .1818883    -1.98   0.047     .1970585    .9900189
         wi2_pla1 |   .9106615   .3540747    -0.24   0.810     .4250117     1.95125
         wi3_pla1 |   .6003444    .222678    -1.38   0.169     .2901852    1.242012
         wi4_pla1 |   .6167694    .229004    -1.30   0.193     .2979032     1.27694
        hedu_cas1 |   .9549434   .0128517    -3.43   0.001     .9300839    .9804674
          edu_edu |   1.008316   .0020141     4.15   0.000     1.004376    1.012271
         rel1_wi1 |   3.920537   1.805609     2.97   0.003      1.58973     9.66869
         rel1_wi2 |   4.835272   2.299106     3.31   0.001     1.904084    12.27879
         rel1_wi3 |   3.983377   1.941954     2.84   0.005      1.53207    10.35677
         rel1_wi4 |   4.210196    2.14126     2.83   0.005     1.553783    11.40812
         hedu_edu |   .9923566   .0022941    -3.32   0.001     .9878705    .9968631
         hedu_wi1 |    .934586   .0427064    -1.48   0.139     .8545218    1.022152
         hedu_wi2 |   .9500756   .0423661    -1.15   0.251     .8705648    1.036848
         hedu_wi3 |   1.020765   .0451417     0.46   0.642     .9360152    1.113189
         hedu_wi4 |   1.011389   .0464131     0.25   0.805     .9243915    1.106573
         edu_pla1 |   1.046203   .0192632     2.45   0.014     1.009121    1.084648
         cas1_yea |    .938995   .0177377    -3.33   0.001     .9048653    .9744119
         cas1_age |   1.052278   .0207952     2.58   0.010     1.012299    1.093835
            _cons |    .097303   .0700604    -3.24   0.001     .0237269    .3990351
-----------------------------------------------------------------------------------
Note: _cons estimates baseline odds.


. 
* Forward selection on the treatment indicator (bihar) adds 28 terms: cas1_rel1 hedu_hedu hdgen1_hhdage edu_yea wi1_age wi2_age wi3_age wi4_age yea_yea hedu_age wi1_pla1 wi2_pla1 wi3_pla1 wi4_pla1 hedu_cas1 edu_edu rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 hedu_edu hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 edu_pla1 cas1_yea cas1_age
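
* A minimal sketch (our addition, not part of the original analysis): the coefficient
* names of the model just fitted by stepwise can be read from e(b) rather than copied
* by hand. The macro name fwd_model is an assumption chosen for illustration. The list
* includes the locked first-term covariates and _cons as well as the added terms.
local fwd_model : colnames e(b)
display "`fwd_model'"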

* We now run backward stepwise selection on the treatment indicator (bihar), starting from the full model

stepwise, lr  pr(0.05) lockterm1: logit bihar (resp_age resp_edu i.religion i.wealth_index i.hhd_gender hh_size husb_edu hhd_age i.caste year_marr i.place_residence) age_age edu_age (rel1_age rel2_age) (wi1_age wi2_age wi3_age wi4_age wi5_age) (hdgen1_age hdgen2_age) siz_age hedu_age hhdage_age (cas1_age cas2_age) yea_age (pla1_age pla2_age) edu_edu (rel1_edu rel2_edu) (wi1_edu wi2_edu wi3_edu wi4_edu wi5_edu) (hdgen1_edu hdgen2_edu) edu_siz hedu_edu hhdage_edu (cas1_edu cas2_edu) edu_yea (edu_pla1 edu_pla2) (rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 rel1_wi5 rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 rel2_wi5) (hdgen1_rel1 hdgen1_rel2 hdgen2_rel1 hdgen2_rel2) (rel1_siz rel2_siz) (hedu_rel1 hedu_rel2) (hhdage_rel1 hhdage_rel2) (cas1_rel1 cas1_rel2 cas2_rel1 cas2_rel2) (rel1_yea rel2_yea) (rel1_pla1 rel1_pla2 rel2_pla1 rel2_pla2) (hdgen1_wi1 hdgen1_wi2 hdgen1_wi3 hdgen1_wi4 hdgen1_wi5 hdgen2_wi1 hdgen2_wi2 hdgen2_wi3 hdgen2_wi4 hdgen2_wi5) (wi1_siz wi2_siz wi3_siz wi4_siz wi5_siz) (hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 hedu_wi5) (hhdage_wi1 hhdage_wi2 hhdage_wi3 hhdage_wi4 hhdage_wi5) (cas1_wi1 cas1_wi2 cas1_wi3 cas1_wi4 cas1_wi5 cas2_wi1 cas2_wi2 cas2_wi3 cas2_wi4 cas2_wi5) (wi1_yea wi2_yea wi3_yea wi4_yea wi5_yea) (wi1_pla1 wi1_pla2 wi2_pla1 wi2_pla2 wi3_pla1 wi3_pla2 wi4_pla1 wi4_pla2 wi5_pla1 wi5_pla2) (hdgen1_siz hdgen2_siz) (hdgen1_hedu hdgen2_hedu) (hdgen1_hhdage hdgen2_hhdage) (cas1_hdgen1 cas1_hdgen2 cas2_hdgen1 cas2_hdgen2) (hdgen1_yea hdgen2_yea) (hdgen1_pla1 hdgen1_pla2 hdgen2_pla1 hdgen2_pla2) siz_siz hedu_siz hhdage_size (cas1_siz cas2_siz) siz_yea (siz_pla1 siz_pla2) hedu_hedu hedu_hhdage (hedu_cas1 hedu_cas2) hedu_yea (hedu_pla1 hedu_pla2) hhdage_hhdage (hhdage_cas1 hhdage_cas2) hhdage_yea (hhdage_pla1 hhdage_pla2) (cas1_yea cas2_yea) (cas1_pla1 cas1_pla2 cas2_pla1 cas2_pla2) yea_yea (pla1_yea pla2_yea), or 

note: rel2_age omitted because of estimability.
note: wi5_age omitted because of estimability.
note: hdgen2_age omitted because of estimability.
note: cas2_age omitted because of estimability.
note: pla2_age omitted because of estimability.
note: rel2_edu omitted because of estimability.
note: wi5_edu omitted because of estimability.
note: hdgen2_edu omitted because of estimability.
note: cas2_edu omitted because of estimability.
note: edu_pla2 omitted because of estimability.
note: rel1_wi5 omitted because of estimability.
note: rel2_wi1 omitted because of estimability.
note: rel2_wi2 omitted because of estimability.
note: rel2_wi3 omitted because of estimability.
note: rel2_wi4 omitted because of estimability.
note: o.rel2_wi5 omitted because of estimability.
note: hdgen1_rel2 omitted because of estimability.
note: hdgen2_rel1 omitted because of estimability.
note: o.hdgen2_rel2 omitted because of estimability.
note: rel2_siz omitted because of estimability.
note: hedu_rel2 omitted because of estimability.
note: hhdage_rel2 omitted because of estimability.
note: cas1_rel2 omitted because of estimability.
note: cas2_rel1 omitted because of estimability.
note: o.cas2_rel2 omitted because of estimability.
note: rel2_yea omitted because of estimability.
note: rel1_pla2 omitted because of estimability.
note: rel2_pla1 omitted because of estimability.
note: o.rel2_pla2 omitted because of estimability.
note: hdgen1_wi5 omitted because of estimability.
note: hdgen2_wi1 omitted because of estimability.
note: hdgen2_wi2 omitted because of estimability.
note: hdgen2_wi3 omitted because of estimability.
note: hdgen2_wi4 omitted because of estimability.
note: o.hdgen2_wi5 omitted because of estimability.
note: wi5_siz omitted because of estimability.
note: hedu_wi5 omitted because of estimability.
note: hhdage_wi5 omitted because of estimability.
note: cas1_wi5 omitted because of estimability.
note: cas2_wi1 omitted because of estimability.
note: cas2_wi2 omitted because of estimability.
note: cas2_wi3 omitted because of estimability.
note: cas2_wi4 omitted because of estimability.
note: o.cas2_wi5 omitted because of estimability.
note: wi5_yea omitted because of estimability.
note: wi1_pla2 omitted because of estimability.
note: wi2_pla2 omitted because of estimability.
note: wi3_pla2 omitted because of estimability.
note: wi4_pla2 omitted because of estimability.
note: wi5_pla1 omitted because of estimability.
note: o.wi5_pla2 omitted because of estimability.
note: hdgen2_siz omitted because of estimability.
note: hdgen2_hedu omitted because of estimability.
note: hdgen2_hhdage omitted because of estimability.
note: cas1_hdgen2 omitted because of estimability.
note: cas2_hdgen1 omitted because of estimability.
note: o.cas2_hdgen2 omitted because of estimability.
note: hdgen2_yea omitted because of estimability.
note: hdgen1_pla2 omitted because of estimability.
note: hdgen2_pla1 omitted because of estimability.
note: o.hdgen2_pla2 omitted because of estimability.
note: cas2_siz omitted because of estimability.
note: siz_pla2 omitted because of estimability.
note: hedu_cas2 omitted because of estimability.
note: hedu_pla2 omitted because of estimability.
note: hhdage_cas2 omitted because of estimability.
note: hhdage_pla2 omitted because of estimability.
note: cas2_yea omitted because of estimability.
note: cas1_pla2 omitted because of estimability.
note: cas2_pla1 omitted because of estimability.
note: o.cas2_pla2 omitted because of estimability.
note: pla2_yea omitted because of estimability.

LR test, begin with full model:
p = 0.9612 >= 0.0500, removing hdgen1_edu
p = 0.9448 >= 0.0500, removing siz_yea
p = 0.9042 >= 0.0500, removing hdgen1_wi1 hdgen1_wi2 hdgen1_wi3 hdgen1_wi4
p = 0.8632 >= 0.0500, removing cas1_siz
p = 0.8253 >= 0.0500, removing hedu_siz
p = 0.7857 >= 0.0500, removing siz_siz
p = 0.7117 >= 0.0500, removing hhdage_yea
p = 0.7982 >= 0.0500, removing hhdage_age
p = 0.6689 >= 0.0500, removing hhdage_edu
p = 0.6718 >= 0.0500, removing rel1_edu
p = 0.6593 >= 0.0500, removing cas1_wi1 cas1_wi2 cas1_wi3 cas1_wi4
p = 0.6271 >= 0.0500, removing hdgen1_age
p = 0.6241 >= 0.0500, removing yea_age
p = 0.8960 >= 0.0500, removing age_age
p = 0.5937 >= 0.0500, removing hhdage_pla1
p = 0.5233 >= 0.0500, removing hdgen1_yea
p = 0.5298 >= 0.0500, removing hdgen1_pla1
p = 0.5710 >= 0.0500, removing hdgen1_hedu
p = 0.4664 >= 0.0500, removing hdgen1_rel1
p = 0.4559 >= 0.0500, removing edu_siz
p = 0.4707 >= 0.0500, removing hhdage_hhdage
p = 0.4774 >= 0.0500, removing hhdage_size
p = 0.4380 >= 0.0500, removing hedu_pla1
p = 0.4743 >= 0.0500, removing rel1_pla1
p = 0.3706 >= 0.0500, removing siz_age
p = 0.3562 >= 0.0500, removing wi1_edu wi2_edu wi3_edu wi4_edu
p = 0.3262 >= 0.0500, removing cas1_edu
p = 0.2856 >= 0.0500, removing hhdage_cas1
p = 0.2710 >= 0.0500, removing hedu_yea
p = 0.3602 >= 0.0500, removing hedu_age
p = 0.2711 >= 0.0500, removing hdgen1_siz
p = 0.2765 >= 0.0500, removing hedu_rel1
p = 0.2800 >= 0.0500, removing wi1_siz wi2_siz wi3_siz wi4_siz
p = 0.5265 >= 0.0500, removing hhdage_wi1 hhdage_wi2 hhdage_wi3 hhdage_wi4
p = 0.2590 >= 0.0500, removing pla1_yea
p = 0.1541 >= 0.0500, removing hedu_hhdage
p = 0.0935 >= 0.0500, removing cas1_hdgen1
p = 0.0763 >= 0.0500, removing edu_age
p = 0.1198 >= 0.0500, removing wi1_yea wi2_yea wi3_yea wi4_yea
p = 0.0765 >= 0.0500, removing siz_pla1
p = 0.0873 >= 0.0500, removing pla1_age
p = 0.0785 >= 0.0500, removing cas1_pla1
p = 0.0543 >= 0.0500, removing wi1_pla1 wi2_pla1 wi3_pla1 wi4_pla1

Logistic regression                                    Number of obs =   6,413
                                                       LR chi2(41)   = 1240.09
                                                       Prob > chi2   =  0.0000
Log likelihood = -3673.4853                            Pseudo R2     =  0.1444

-----------------------------------------------------------------------------------
            bihar | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
------------------+----------------------------------------------------------------
         resp_age |   .9276748   .0226461    -3.08   0.002     .8843344    .9731394
         resp_edu |   .8101032    .025805    -6.61   0.000     .7610729    .8622922
       1.religion |   11.20789   8.513003     3.18   0.001     2.529237    49.66585
                  |
     wealth_index |
               2  |   .4606584   .1539531    -2.32   0.020     .2392795    .8868549
               3  |   .5389724   .2365058    -1.41   0.159     .2280615    1.273741
               4  |   .5522747   .3333735    -0.98   0.325     .1691758    1.802902
               5  |   2.772427   2.305506     1.23   0.220     .5432685    14.14834
                  |
     1.hhd_gender |   8.813727   2.201215     8.71   0.000     5.402233    14.37957
          hh_size |   1.123245   .0180548     7.23   0.000      1.08841    1.159195
         husb_edu |   1.020107   .0535484     0.38   0.705     .9203727    1.130649
          hhd_age |   .9772721   .0052808    -4.25   0.000     .9669766    .9876773
          1.caste |   .8494583   .3545154    -0.39   0.696     .3748859    1.924797
        year_marr |   .9381907    .020345    -2.94   0.003     .8991508    .9789256
1.place_residence |   .8995674   .1110284    -0.86   0.391     .7062766    1.145757
         rel1_wi1 |   3.971466   1.840456     2.98   0.003     1.601353    9.849507
         rel1_wi2 |   4.683992   2.230759     3.24   0.001     1.841747    11.91248
         rel1_wi3 |   4.033114   1.967209     2.86   0.004     1.550439    10.49123
         rel1_wi4 |   4.454648   2.266228     2.94   0.003     1.643534    12.07391
         rel1_siz |   .9081394   .0342329    -2.56   0.011     .8434628    .9777755
         rel1_age |   .9371564   .0231484    -2.63   0.009     .8928672    .9836425
          wi1_age |   1.073889   .0203151     3.77   0.000     1.034801    1.114453
          wi2_age |    1.09417    .020528     4.80   0.000     1.054667    1.135153
          wi3_age |   1.070876   .0204389     3.59   0.000     1.031557    1.111695
          wi4_age |   1.067739   .0215233     3.25   0.001     1.026376    1.110768
      hhdage_rel1 |   1.013238    .006458     2.06   0.039      1.00066    1.025975
          yea_yea |    1.00147   .0004097     3.59   0.000     1.000667    1.002273
         hedu_wi1 |   .9256653   .0415454    -1.72   0.085     .8477165    1.010782
         hedu_wi2 |    .947821   .0415939    -1.22   0.222      .869706    1.032952
         hedu_wi3 |   1.015251    .044444     0.35   0.730     .9317745    1.106206
         hedu_wi4 |   1.015726   .0464224     0.34   0.733     .9286959    1.110912
         cas1_yea |   .9349104   .0178315    -3.53   0.000     .9006064     .970521
         cas1_age |   1.058842   .0211782     2.86   0.004     1.018136    1.101174
        hedu_hedu |    1.00844    .001812     4.68   0.000     1.004895    1.011998
         edu_pla1 |   1.055738   .0165012     3.47   0.001     1.023887    1.088581
          edu_edu |    1.00866   .0019769     4.40   0.000     1.004793    1.012542
         rel1_yea |   1.059385   .0243037     2.51   0.012     1.012806    1.108107
        hedu_cas1 |   .9534281   .0128424    -3.54   0.000     .9285869    .9789339
        cas1_rel1 |   6.956148   1.372876     9.83   0.000     4.724701    10.24149
          edu_yea |   1.006837     .00099     6.93   0.000     1.004898    1.008779
         hedu_edu |   .9915125   .0021623    -3.91   0.000     .9872835    .9957597
    hdgen1_hhdage |   1.028228   .0058162     4.92   0.000     1.016892    1.039691
            _cons |   .1003585   .0749862    -3.08   0.002     .0232033    .4340683
-----------------------------------------------------------------------------------
Note: _cons estimates baseline odds.


* Backward selection on the treatment indicator retains 27 terms: rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 rel1_siz rel1_age wi1_age wi2_age wi3_age wi4_age hhdage_rel1 yea_yea hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas1_yea cas1_age hedu_hedu edu_pla1 edu_edu rel1_yea hedu_cas1 cas1_rel1 edu_yea hedu_edu hdgen1_hhdage

* For comparison, forward selection on the treatment indicator added 28 terms: cas1_rel1 hedu_hedu hdgen1_hhdage edu_yea wi1_age wi2_age wi3_age wi4_age yea_yea hedu_age wi1_pla1 wi2_pla1 wi3_pla1 wi4_pla1 hedu_cas1 edu_edu rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 hedu_edu hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 edu_pla1 cas1_yea cas1_age





* Terms common to both the forward and backward stepwise runs on the treatment indicator (23 terms):
* cas1_rel1 hedu_hedu hdgen1_hhdage yea_yea edu_yea hedu_edu rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 wi1_age wi2_age wi3_age wi4_age hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 edu_edu cas1_yea cas1_age edu_pla1 hedu_cas1
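
* A hedged sketch (our addition): the common terms can be derived with Stata's macro
* list operators instead of being compared by hand. The macro names fwd, bwd, common,
* and n_common are assumptions chosen for illustration. The two lists are copied from
* the forward and backward results reported above.
local fwd cas1_rel1 hedu_hedu hdgen1_hhdage edu_yea wi1_age wi2_age wi3_age wi4_age yea_yea hedu_age wi1_pla1 wi2_pla1 wi3_pla1 wi4_pla1 hedu_cas1 edu_edu rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 hedu_edu hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 edu_pla1 cas1_yea cas1_age
local bwd rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 rel1_siz rel1_age wi1_age wi2_age wi3_age wi4_age hhdage_rel1 yea_yea hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas1_yea cas1_age hedu_hedu edu_pla1 edu_edu rel1_yea hedu_cas1 cas1_rel1 edu_yea hedu_edu hdgen1_hhdage
* Intersection of the two lists
local common : list fwd & bwd
* Number of elements in the intersection
local n_common : list sizeof common
display "`n_common' common terms: `common'"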

* We now run forward and backward stepwise selection on the outcome variable (ipv_index_norm)
* Forward stepwise on the outcome variable
 
stepwise, lr  pe(0.05) lockterm1: regress ipv_index_norm (resp_age resp_edu i.religion i.wealth_index i.hhd_gender hh_size husb_edu hhd_age i.caste year_marr i.place_residence) age_age edu_age (rel1_age rel2_age) (wi1_age wi2_age wi3_age wi4_age wi5_age) (hdgen1_age hdgen2_age) siz_age hedu_age hhdage_age (cas1_age cas2_age) yea_age (pla1_age pla2_age) edu_edu (rel1_edu rel2_edu) (wi1_edu wi2_edu wi3_edu wi4_edu wi5_edu) (hdgen1_edu hdgen2_edu) edu_siz hedu_edu hhdage_edu (cas1_edu cas2_edu) edu_yea (edu_pla1 edu_pla2) (rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 rel1_wi5 rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 rel2_wi5) (hdgen1_rel1 hdgen1_rel2 hdgen2_rel1 hdgen2_rel2) (rel1_siz rel2_siz) (hedu_rel1 hedu_rel2) (hhdage_rel1 hhdage_rel2) (cas1_rel1 cas1_rel2 cas2_rel1 cas2_rel2) (rel1_yea rel2_yea) (rel1_pla1 rel1_pla2 rel2_pla1 rel2_pla2) (hdgen1_wi1 hdgen1_wi2 hdgen1_wi3 hdgen1_wi4 hdgen1_wi5 hdgen2_wi1 hdgen2_wi2 hdgen2_wi3 hdgen2_wi4 hdgen2_wi5) (wi1_siz wi2_siz wi3_siz wi4_siz wi5_siz) (hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 hedu_wi5) (hhdage_wi1 hhdage_wi2 hhdage_wi3 hhdage_wi4 hhdage_wi5) (cas1_wi1 cas1_wi2 cas1_wi3 cas1_wi4 cas1_wi5 cas2_wi1 cas2_wi2 cas2_wi3 cas2_wi4 cas2_wi5) (wi1_yea wi2_yea wi3_yea wi4_yea wi5_yea) (wi1_pla1 wi1_pla2 wi2_pla1 wi2_pla2 wi3_pla1 wi3_pla2 wi4_pla1 wi4_pla2 wi5_pla1 wi5_pla2) (hdgen1_siz hdgen2_siz) (hdgen1_hedu hdgen2_hedu) (hdgen1_hhdage hdgen2_hhdage) (cas1_hdgen1 cas1_hdgen2 cas2_hdgen1 cas2_hdgen2) (hdgen1_yea hdgen2_yea) (hdgen1_pla1 hdgen1_pla2 hdgen2_pla1 hdgen2_pla2) siz_siz hedu_siz hhdage_size (cas1_siz cas2_siz) siz_yea (siz_pla1 siz_pla2) hedu_hedu hedu_hhdage (hedu_cas1 hedu_cas2) hedu_yea (hedu_pla1 hedu_pla2) hhdage_hhdage (hhdage_cas1 hhdage_cas2) hhdage_yea (hhdage_pla1 hhdage_pla2) (cas1_yea cas2_yea) (cas1_pla1 cas1_pla2 cas2_pla1 cas2_pla2) yea_yea (pla1_yea pla2_yea)


note: rel2_age omitted because of estimability.
note: wi5_age omitted because of estimability.
note: hdgen2_age omitted because of estimability.
note: cas2_age omitted because of estimability.
note: pla2_age omitted because of estimability.
note: rel2_edu omitted because of estimability.
note: wi5_edu omitted because of estimability.
note: hdgen2_edu omitted because of estimability.
note: cas2_edu omitted because of estimability.
note: edu_pla2 omitted because of estimability.
note: rel1_wi5 omitted because of estimability.
note: rel2_wi1 omitted because of estimability.
note: rel2_wi2 omitted because of estimability.
note: rel2_wi3 omitted because of estimability.
note: rel2_wi4 omitted because of estimability.
note: o.rel2_wi5 omitted because of estimability.
note: hdgen1_rel2 omitted because of estimability.
note: hdgen2_rel1 omitted because of estimability.
note: o.hdgen2_rel2 omitted because of estimability.
note: rel2_siz omitted because of estimability.
note: hedu_rel2 omitted because of estimability.
note: hhdage_rel2 omitted because of estimability.
note: cas1_rel2 omitted because of estimability.
note: cas2_rel1 omitted because of estimability.
note: o.cas2_rel2 omitted because of estimability.
note: rel2_yea omitted because of estimability.
note: rel1_pla2 omitted because of estimability.
note: rel2_pla1 omitted because of estimability.
note: o.rel2_pla2 omitted because of estimability.
note: hdgen1_wi5 omitted because of estimability.
note: hdgen2_wi1 omitted because of estimability.
note: hdgen2_wi2 omitted because of estimability.
note: hdgen2_wi3 omitted because of estimability.
note: hdgen2_wi4 omitted because of estimability.
note: o.hdgen2_wi5 omitted because of estimability.
note: wi5_siz omitted because of estimability.
note: hedu_wi5 omitted because of estimability.
note: hhdage_wi5 omitted because of estimability.
note: cas1_wi5 omitted because of estimability.
note: cas2_wi1 omitted because of estimability.
note: cas2_wi2 omitted because of estimability.
note: cas2_wi3 omitted because of estimability.
note: cas2_wi4 omitted because of estimability.
note: o.cas2_wi5 omitted because of estimability.
note: wi5_yea omitted because of estimability.
note: wi1_pla2 omitted because of estimability.
note: wi2_pla2 omitted because of estimability.
note: wi3_pla2 omitted because of estimability.
note: wi4_pla2 omitted because of estimability.
note: wi5_pla1 omitted because of estimability.
note: o.wi5_pla2 omitted because of estimability.
note: hdgen2_siz omitted because of estimability.
note: hdgen2_hedu omitted because of estimability.
note: hdgen2_hhdage omitted because of estimability.
note: cas1_hdgen2 omitted because of estimability.
note: cas2_hdgen1 omitted because of estimability.
note: o.cas2_hdgen2 omitted because of estimability.
note: hdgen2_yea omitted because of estimability.
note: hdgen1_pla2 omitted because of estimability.
note: hdgen2_pla1 omitted because of estimability.
note: o.hdgen2_pla2 omitted because of estimability.
note: cas2_siz omitted because of estimability.
note: siz_pla2 omitted because of estimability.
note: hedu_cas2 omitted because of estimability.
note: hedu_pla2 omitted because of estimability.
note: hhdage_cas2 omitted because of estimability.
note: hhdage_pla2 omitted because of estimability.
note: cas2_yea omitted because of estimability.
note: cas1_pla2 omitted because of estimability.
note: cas2_pla1 omitted because of estimability.
note: o.cas2_pla2 omitted because of estimability.
note: pla2_yea omitted because of estimability.

LR test, begin with term 1 model:
p = 0.0001 <  0.0500, adding cas1_hdgen1
p = 0.0103 <  0.0500, adding hdgen1_siz
p = 0.0108 <  0.0500, adding cas1_rel1
p = 0.0091 <  0.0500, adding yea_yea
p = 0.0402 <  0.0500, adding hedu_hedu

      Source |       SS           df       MS      Number of obs   =     6,358
-------------+----------------------------------   F(19, 6338)     =     17.96
       Model |  623.967949        19  32.8404184   Prob > F        =    0.0000
    Residual |  11588.5773     6,338  1.82842809   R-squared       =    0.0511
-------------+----------------------------------   Adj R-squared   =    0.0482
       Total |  12212.5452     6,357   1.9211177   Root MSE        =    1.3522

-----------------------------------------------------------------------------------
   ipv_index_norm | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
------------------+----------------------------------------------------------------
         resp_age |  -.0191612   .0055011    -3.48   0.000    -.0299452   -.0083772
         resp_edu |  -.0301514   .0051751    -5.83   0.000    -.0402962   -.0200065
       1.religion |   .3347523    .074903     4.47   0.000      .187917    .4815876
                  |
     wealth_index |
               2  |  -.0024907   .0459513    -0.05   0.957    -.0925707    .0875893
               3  |  -.1595629   .0594418    -2.68   0.007    -.2760889   -.0430369
               4  |  -.1205667   .0770773    -1.56   0.118    -.2716643    .0305309
               5  |  -.3652793   .1038395    -3.52   0.000    -.5688399   -.1617186
                  |
     1.hhd_gender |   .6576907   .1321404     4.98   0.000     .3986507    .9167307
          hh_size |  -.0281691   .0200262    -1.41   0.160    -.0674273    .0110891
         husb_edu |  -.0332455   .0102044    -3.26   0.001    -.0532496   -.0132413
          hhd_age |  -.0019176   .0014053    -1.36   0.172    -.0046724    .0008373
          1.caste |   .3761904   .0884084     4.26   0.000     .2028801    .5495007
        year_marr |   .0274409   .0088335     3.11   0.002     .0101243    .0447576
1.place_residence |   .0468204   .0534824     0.88   0.381    -.0580231     .151664
      cas1_hdgen1 |   .3884987   .0951525     4.08   0.000     .2019677    .5750298
       hdgen1_siz |   .0545527   .0213088     2.56   0.010     .0127802    .0963251
        cas1_rel1 |   .2371906   .0926501     2.56   0.010      .055565    .4188161
          yea_yea |  -.0005771   .0002226    -2.59   0.010    -.0010134   -.0001408
        hedu_hedu |    .001533   .0007483     2.05   0.041     .0000661    .0029998
            _cons |   .2578479    .181244     1.42   0.155    -.0974516    .6131474
-----------------------------------------------------------------------------------

* Forward selection on the outcome variable adds 5 terms: cas1_hdgen1 hdgen1_siz cas1_rel1 yea_yea hedu_hedu



* Backward stepwise on the outcome variable

stepwise, lr  pr(0.05) lockterm1: regress ipv_index_norm (resp_age resp_edu i.religion i.wealth_index i.hhd_gender hh_size husb_edu hhd_age i.caste year_marr i.place_residence) age_age edu_age (rel1_age rel2_age) (wi1_age wi2_age wi3_age wi4_age wi5_age) (hdgen1_age hdgen2_age) siz_age hedu_age hhdage_age (cas1_age cas2_age) yea_age (pla1_age pla2_age) edu_edu (rel1_edu rel2_edu) (wi1_edu wi2_edu wi3_edu wi4_edu wi5_edu) (hdgen1_edu hdgen2_edu) edu_siz hedu_edu hhdage_edu (cas1_edu cas2_edu) edu_yea (edu_pla1 edu_pla2) (rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 rel1_wi5 rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 rel2_wi5) (hdgen1_rel1 hdgen1_rel2 hdgen2_rel1 hdgen2_rel2) (rel1_siz rel2_siz) (hedu_rel1 hedu_rel2) (hhdage_rel1 hhdage_rel2) (cas1_rel1 cas1_rel2 cas2_rel1 cas2_rel2) (rel1_yea rel2_yea) (rel1_pla1 rel1_pla2 rel2_pla1 rel2_pla2) (hdgen1_wi1 hdgen1_wi2 hdgen1_wi3 hdgen1_wi4 hdgen1_wi5 hdgen2_wi1 hdgen2_wi2 hdgen2_wi3 hdgen2_wi4 hdgen2_wi5) (wi1_siz wi2_siz wi3_siz wi4_siz wi5_siz) (hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 hedu_wi5) (hhdage_wi1 hhdage_wi2 hhdage_wi3 hhdage_wi4 hhdage_wi5) (cas1_wi1 cas1_wi2 cas1_wi3 cas1_wi4 cas1_wi5 cas2_wi1 cas2_wi2 cas2_wi3 cas2_wi4 cas2_wi5) (wi1_yea wi2_yea wi3_yea wi4_yea wi5_yea) (wi1_pla1 wi1_pla2 wi2_pla1 wi2_pla2 wi3_pla1 wi3_pla2 wi4_pla1 wi4_pla2 wi5_pla1 wi5_pla2) (hdgen1_siz hdgen2_siz) (hdgen1_hedu hdgen2_hedu) (hdgen1_hhdage hdgen2_hhdage) (cas1_hdgen1 cas1_hdgen2 cas2_hdgen1 cas2_hdgen2) (hdgen1_yea hdgen2_yea) (hdgen1_pla1 hdgen1_pla2 hdgen2_pla1 hdgen2_pla2) siz_siz hedu_siz hhdage_size (cas1_siz cas2_siz) siz_yea (siz_pla1 siz_pla2) hedu_hedu hedu_hhdage (hedu_cas1 hedu_cas2) hedu_yea (hedu_pla1 hedu_pla2) hhdage_hhdage (hhdage_cas1 hhdage_cas2) hhdage_yea (hhdage_pla1 hhdage_pla2) (cas1_yea cas2_yea) (cas1_pla1 cas1_pla2 cas2_pla1 cas2_pla2) yea_yea (pla1_yea pla2_yea)


note: rel2_age omitted because of estimability.
note: wi5_age omitted because of estimability.
note: hdgen2_age omitted because of estimability.
note: cas2_age omitted because of estimability.
note: pla2_age omitted because of estimability.
note: rel2_edu omitted because of estimability.
note: wi5_edu omitted because of estimability.
note: hdgen2_edu omitted because of estimability.
note: cas2_edu omitted because of estimability.
note: edu_pla2 omitted because of estimability.
note: rel1_wi5 omitted because of estimability.
note: rel2_wi1 omitted because of estimability.
note: rel2_wi2 omitted because of estimability.
note: rel2_wi3 omitted because of estimability.
note: rel2_wi4 omitted because of estimability.
note: o.rel2_wi5 omitted because of estimability.
note: hdgen1_rel2 omitted because of estimability.
note: hdgen2_rel1 omitted because of estimability.
note: o.hdgen2_rel2 omitted because of estimability.
note: rel2_siz omitted because of estimability.
note: hedu_rel2 omitted because of estimability.
note: hhdage_rel2 omitted because of estimability.
note: cas1_rel2 omitted because of estimability.
note: cas2_rel1 omitted because of estimability.
note: o.cas2_rel2 omitted because of estimability.
note: rel2_yea omitted because of estimability.
note: rel1_pla2 omitted because of estimability.
note: rel2_pla1 omitted because of estimability.
note: o.rel2_pla2 omitted because of estimability.
note: hdgen1_wi5 omitted because of estimability.
note: hdgen2_wi1 omitted because of estimability.
note: hdgen2_wi2 omitted because of estimability.
note: hdgen2_wi3 omitted because of estimability.
note: hdgen2_wi4 omitted because of estimability.
note: o.hdgen2_wi5 omitted because of estimability.
note: wi5_siz omitted because of estimability.
note: hedu_wi5 omitted because of estimability.
note: hhdage_wi5 omitted because of estimability.
note: cas1_wi5 omitted because of estimability.
note: cas2_wi1 omitted because of estimability.
note: cas2_wi2 omitted because of estimability.
note: cas2_wi3 omitted because of estimability.
note: cas2_wi4 omitted because of estimability.
note: o.cas2_wi5 omitted because of estimability.
note: wi5_yea omitted because of estimability.
note: wi1_pla2 omitted because of estimability.
note: wi2_pla2 omitted because of estimability.
note: wi3_pla2 omitted because of estimability.
note: wi4_pla2 omitted because of estimability.
note: wi5_pla1 omitted because of estimability.
note: o.wi5_pla2 omitted because of estimability.
note: hdgen2_siz omitted because of estimability.
note: hdgen2_hedu omitted because of estimability.
note: hdgen2_hhdage omitted because of estimability.
note: cas1_hdgen2 omitted because of estimability.
note: cas2_hdgen1 omitted because of estimability.
note: o.cas2_hdgen2 omitted because of estimability.
note: hdgen2_yea omitted because of estimability.
note: hdgen1_pla2 omitted because of estimability.
note: hdgen2_pla1 omitted because of estimability.
note: o.hdgen2_pla2 omitted because of estimability.
note: cas2_siz omitted because of estimability.
note: siz_pla2 omitted because of estimability.
note: hedu_cas2 omitted because of estimability.
note: hedu_pla2 omitted because of estimability.
note: hhdage_cas2 omitted because of estimability.
note: hhdage_pla2 omitted because of estimability.
note: cas2_yea omitted because of estimability.
note: cas1_pla2 omitted because of estimability.
note: cas2_pla1 omitted because of estimability.
note: o.cas2_pla2 omitted because of estimability.
note: pla2_yea omitted because of estimability.

LR test, begin with full model:
p = 0.9990 >= 0.0500, removing hhdage_pla1
p = 0.9851 >= 0.0500, removing edu_edu
p = 0.9123 >= 0.0500, removing rel1_yea
p = 0.8972 >= 0.0500, removing rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4
p = 0.9393 >= 0.0500, removing hedu_rel1
p = 0.8735 >= 0.0500, removing hhdage_edu
p = 0.8389 >= 0.0500, removing cas1_siz
p = 0.7948 >= 0.0500, removing siz_pla1
p = 0.7734 >= 0.0500, removing hedu_age
p = 0.7745 >= 0.0500, removing hhdage_wi1 hhdage_wi2 hhdage_wi3 hhdage_wi4
p = 0.7241 >= 0.0500, removing hhdage_hhdage
p = 0.7282 >= 0.0500, removing hdgen1_yea
p = 0.7313 >= 0.0500, removing hdgen1_rel1
p = 0.7147 >= 0.0500, removing rel1_age
p = 0.7078 >= 0.0500, removing wi1_yea wi2_yea wi3_yea wi4_yea
p = 0.8096 >= 0.0500, removing edu_age
p = 0.7069 >= 0.0500, removing cas1_pla1
p = 0.6552 >= 0.0500, removing yea_age
p = 0.6580 >= 0.0500, removing hdgen1_edu
p = 0.6060 >= 0.0500, removing cas1_wi1 cas1_wi2 cas1_wi3 cas1_wi4
p = 0.5611 >= 0.0500, removing siz_siz
p = 0.5299 >= 0.0500, removing wi1_edu wi2_edu wi3_edu wi4_edu
p = 0.4481 >= 0.0500, removing rel1_edu
p = 0.4509 >= 0.0500, removing hdgen1_hhdage
p = 0.3978 >= 0.0500, removing wi1_age wi2_age wi3_age wi4_age
p = 0.4163 >= 0.0500, removing cas1_edu
p = 0.4120 >= 0.0500, removing hedu_cas1
p = 0.3697 >= 0.0500, removing hedu_edu
p = 0.4484 >= 0.0500, removing hedu_hedu
p = 0.3379 >= 0.0500, removing hdgen1_pla1
p = 0.3244 >= 0.0500, removing rel1_siz
p = 0.2788 >= 0.0500, removing edu_pla1
p = 0.2710 >= 0.0500, removing wi1_pla1 wi2_pla1 wi3_pla1 wi4_pla1
p = 0.2317 >= 0.0500, removing hhdage_cas1
p = 0.2903 >= 0.0500, removing cas1_age
p = 0.2209 >= 0.0500, removing edu_siz
p = 0.2137 >= 0.0500, removing hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4
p = 0.1779 >= 0.0500, removing cas1_yea
p = 0.1724 >= 0.0500, removing hdgen1_hedu
p = 0.2503 >= 0.0500, removing hdgen1_wi1 hdgen1_wi2 hdgen1_wi3 hdgen1_wi4
p = 0.1580 >= 0.0500, removing hhdage_size
p = 0.1073 >= 0.0500, removing age_age
p = 0.1051 >= 0.0500, removing hhdage_age
p = 0.1662 >= 0.0500, removing siz_age
p = 0.2427 >= 0.0500, removing siz_yea
p = 0.0858 >= 0.0500, removing pla1_age
p = 0.1234 >= 0.0500, removing hedu_pla1
p = 0.1613 >= 0.0500, removing pla1_yea
p = 0.1200 >= 0.0500, removing rel1_pla1
p = 0.0914 >= 0.0500, removing hhdage_rel1
p = 0.0760 >= 0.0500, removing hhdage_yea
p = 0.1067 >= 0.0500, removing hdgen1_age
p = 0.0637 >= 0.0500, removing edu_yea
p = 0.2837 >= 0.0500, removing hedu_yea
p = 0.0586 >= 0.0500, removing hedu_siz
p = 0.1501 >= 0.0500, removing wi1_siz wi2_siz wi3_siz wi4_siz
p = 0.0580 >= 0.0500, removing hedu_hhdage

      Source |       SS           df       MS      Number of obs   =     6,358
-------------+----------------------------------   F(18, 6339)     =     18.72
       Model |   616.29415        18  34.2385639   Prob > F        =    0.0000
    Residual |  11596.2511     6,339  1.82935022   R-squared       =    0.0505
-------------+----------------------------------   Adj R-squared   =    0.0478
       Total |  12212.5452     6,357   1.9211177   Root MSE        =    1.3525

-----------------------------------------------------------------------------------
   ipv_index_norm | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
------------------+----------------------------------------------------------------
         resp_age |  -.0181671    .005481    -3.31   0.001    -.0289117   -.0074225
         resp_edu |  -.0284105   .0051061    -5.56   0.000    -.0384202   -.0184008
       1.religion |   .3389288   .0748942     4.53   0.000     .1921108    .4857467
                  |
     wealth_index |
               2  |  -.0089381   .0458549    -0.19   0.845    -.0988292    .0809531
               3  |  -.1596151   .0594567    -2.68   0.007    -.2761705   -.0430598
               4  |  -.1099999   .0769239    -1.43   0.153    -.2607968     .040797
               5  |  -.3272315    .102191    -3.20   0.001    -.5275604   -.1269027
                  |
     1.hhd_gender |   .6587655   .1321727     4.98   0.000     .3996622    .9178687
          hh_size |  -.0287611   .0200292    -1.44   0.151    -.0680251    .0105029
         husb_edu |  -.0144612   .0044797    -3.23   0.001     -.023243   -.0056795
          hhd_age |  -.0017948   .0014044    -1.28   0.201    -.0045479    .0009582
          1.caste |   .3770471   .0884297     4.26   0.000      .203695    .5503992
        year_marr |   .0269855   .0088329     3.06   0.002       .00967     .044301
1.place_residence |   .0427472   .0534589     0.80   0.424    -.0620502    .1475447
        cas1_rel1 |   .2419274   .0926446     2.61   0.009     .0603126    .4235421
          yea_yea |  -.0005802   .0002226    -2.61   0.009    -.0010166   -.0001438
      cas1_hdgen1 |   .3889118   .0951763     4.09   0.000     .2023342    .5754895
       hdgen1_siz |   .0543594   .0213139     2.55   0.011     .0125769    .0961419
            _cons |   .2036136   .1793454     1.14   0.256    -.1479639    .5551912
-----------------------------------------------------------------------------------
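
* A minimal sketch (not part of the original run) of why the notes above report so many
* terms as "omitted because of estimability": each categorical variable enters with a full
* set of dummies, so interactions built from the last dummy are linear combinations of
* terms already in the model. With two religion dummies, rel1 + rel2 = 1, and therefore
* rel1_age + rel2_age = resp_age. The checks assume the baseline dummies and interactions
* were generated with the same tabulate/gen pattern used elsewhere in this file.

assert rel1 + rel2 == 1 if !missing(religion)
assert rel1_age + rel2_age == float(resp_age) if !missing(religion)
assert wi1 + wi2 + wi3 + wi4 + wi5 == 1 if !missing(wealth_index)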

. 
* we get  cas1_rel1  yea_yea cas1_hdgen1  hdgen1_siz (4 terms) (backward)

* we get  cas1_hdgen1 hdgen1_siz  cas1_rel1 yea_yea hedu_hedu (5 terms) (forward)




* common terms in both the forward and backward stepwise regressions on the outcome variable (4 terms)

* cas1_rel1 yea_yea cas1_hdgen1 hdgen1_siz
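
* A minimal sketch (not part of the original run): the overlap listed above can be
* cross-checked with Stata's macro list operators. The local names fwd, bwd, common and
* n_common are hypothetical; the two term lists are copied from the forward and backward
* stepwise results on the outcome variable.

local fwd cas1_hdgen1 hdgen1_siz cas1_rel1 yea_yea hedu_hedu
local bwd cas1_rel1 yea_yea cas1_hdgen1 hdgen1_siz
* intersection: terms selected by both procedures
local common : list fwd & bwd
local n_common : list sizeof common
display "`n_common' common terms: `common'"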





* common terms in both the forward and backward stepwise regressions on the treatment indicator (23 terms)

* cas1_rel1 hedu_hedu hdgen1_hhdage yea_yea edu_yea hedu_edu rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 wi1_age wi2_age wi3_age wi4_age hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 edu_edu cas1_yea cas1_age edu_pla1 hedu_cas1


* we now take the union of the two sets: 2 terms common to both, 2 only in the outcome set and 21 only in the treatment set (25 terms)

* cas1_rel1 hedu_hedu hdgen1_hhdage yea_yea edu_yea hedu_edu rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 wi1_age wi2_age wi3_age wi4_age hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 edu_edu cas1_yea cas1_age edu_pla1 hedu_cas1 cas1_hdgen1 hdgen1_siz
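
* A minimal sketch (not part of the original run) that recomputes the counts quoted above
* with macro list operators. The local names out_terms, trt_terms, both and all are
* hypothetical; the lists are copied from the two sets of common terms.

local out_terms cas1_rel1 yea_yea cas1_hdgen1 hdgen1_siz
local trt_terms cas1_rel1 hedu_hedu hdgen1_hhdage yea_yea edu_yea hedu_edu ///
    rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 wi1_age wi2_age wi3_age wi4_age ///
    hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 edu_edu cas1_yea cas1_age edu_pla1 hedu_cas1
* terms selected in both sets
local both : list out_terms & trt_terms
* terms selected in either set
local all : list out_terms | trt_terms
local n_both : list sizeof both
local n_all : list sizeof all
display "`n_both' terms in both sets, `n_all' terms in the union"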




* so we have 11 linear and 25 nonlinear terms from the algorithmic stepwise regressions (36 terms in total)

* resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu hhd_age caste year_marr place_residence cas1_rel1 hedu_hedu hdgen1_hhdage yea_yea edu_yea hedu_edu rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 wi1_age wi2_age wi3_age wi4_age hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 edu_edu cas1_yea cas1_age edu_pla1 hedu_cas1 cas1_hdgen1 hdgen1_siz


* Any control variable selected in the baseline selection model that displays multicollinearity in the endline data should be dropped from the selection model.
* We therefore open the endline dataset and first create the interaction and quadratic terms (all of them are created here, although only the 25 selected ones are needed).

* open endline dataset 

* generate all interaction and quadratic terms 

* we first create dummies for each categorical variable 

tabulate religion, generate(rel)
tabulate wealth_index, generate(wi)
tabulate hhd_gender, generate(hdgen)
tabulate caste, generate(cas)
tabulate place_residence, generate(pla)

* now we create interaction terms 
 
gen age_age = resp_age*resp_age
gen edu_age = resp_edu*resp_age
gen rel1_age =  (rel1)*resp_age
gen rel2_age =  (rel2)*resp_age
gen wi1_age = (wi1)*resp_age 
gen wi2_age = (wi2)*resp_age 
gen wi3_age = (wi3)*resp_age 
gen wi4_age = (wi4)*resp_age 
gen wi5_age = (wi5)*resp_age 
gen hdgen1_age = hdgen1*resp_age
gen hdgen2_age = hdgen2*resp_age
gen siz_age =  hh_size*resp_age 
gen hedu_age = husb_edu*resp_age
gen hhdage_age = hhd_age*resp_age
gen cas1_age =  cas1*resp_age
gen cas2_age =  cas2*resp_age
gen yea_age =  year_marr*resp_age
gen pla1_age =  pla1*resp_age
gen pla2_age =  pla2*resp_age

gen edu_edu =  resp_edu*resp_edu 
gen rel1_edu =  rel1*resp_edu
gen rel2_edu =  rel2*resp_edu
gen wi1_edu = wi1*resp_edu
gen wi2_edu = wi2*resp_edu
gen wi3_edu = wi3*resp_edu
gen wi4_edu = wi4*resp_edu
gen wi5_edu = wi5*resp_edu
gen hdgen1_edu = hdgen1*resp_edu
gen hdgen2_edu = hdgen2*resp_edu
gen edu_siz =  resp_edu*hh_size 
gen hedu_edu = husb_edu*resp_edu
gen hhdage_edu = hhd_age*resp_edu
gen cas1_edu = cas1*resp_edu
gen cas2_edu = cas2*resp_edu
gen edu_yea = resp_edu*year_marr 
gen edu_pla1 = resp_edu*pla1
gen edu_pla2 = resp_edu*pla2


gen rel1_wi1 =  rel1*wi1
gen rel1_wi2 =  rel1*wi2
gen rel1_wi3 =  rel1*wi3
gen rel1_wi4 =  rel1*wi4
gen rel1_wi5 =  rel1*wi5
gen rel2_wi1 =  rel2*wi1
gen rel2_wi2 =  rel2*wi2
gen rel2_wi3 =  rel2*wi3
gen rel2_wi4 =  rel2*wi4
gen rel2_wi5 =  rel2*wi5
gen hdgen1_rel1 =  hdgen1*rel1
gen hdgen1_rel2 =  hdgen1*rel2
gen hdgen2_rel1 =  hdgen2*rel1
gen hdgen2_rel2 =  hdgen2*rel2
gen rel1_siz =  rel1*hh_size
gen rel2_siz =  rel2*hh_size
gen hedu_rel1 = husb_edu*rel1
gen hedu_rel2 = husb_edu*rel2
gen hhdage_rel1 = hhd_age*rel1
gen hhdage_rel2 = hhd_age*rel2
gen cas1_rel1 =  cas1*rel1
gen cas1_rel2 =  cas1*rel2
gen cas2_rel1 =  cas2*rel1
gen cas2_rel2 =  cas2*rel2
gen rel1_yea =  rel1*year_marr
gen rel2_yea =  rel2*year_marr
gen rel1_pla1 =  rel1*pla1
gen rel1_pla2 =  rel1*pla2
gen rel2_pla1 =  rel2*pla1
gen rel2_pla2 =  rel2*pla2



gen hdgen1_wi1 = hdgen1*wi1
gen hdgen1_wi2 = hdgen1*wi2
gen hdgen1_wi3 = hdgen1*wi3
gen hdgen1_wi4 = hdgen1*wi4
gen hdgen1_wi5 = hdgen1*wi5
gen hdgen2_wi1 = hdgen2*wi1
gen hdgen2_wi2 = hdgen2*wi2
gen hdgen2_wi3 = hdgen2*wi3
gen hdgen2_wi4 = hdgen2*wi4
gen hdgen2_wi5 = hdgen2*wi5
gen wi1_siz = wi1*hh_size
gen wi2_siz = wi2*hh_size
gen wi3_siz = wi3*hh_size
gen wi4_siz = wi4*hh_size
gen wi5_siz = wi5*hh_size
gen hedu_wi1 = husb_edu*wi1
gen hedu_wi2 = husb_edu*wi2
gen hedu_wi3 = husb_edu*wi3
gen hedu_wi4 = husb_edu*wi4
gen hedu_wi5 = husb_edu*wi5
gen hhdage_wi1 = hhd_age*wi1
gen hhdage_wi2 = hhd_age*wi2
gen hhdage_wi3 = hhd_age*wi3
gen hhdage_wi4 = hhd_age*wi4
gen hhdage_wi5 = hhd_age*wi5
gen cas1_wi1 = cas1*wi1
gen cas1_wi2 = cas1*wi2
gen cas1_wi3 = cas1*wi3
gen cas1_wi4 = cas1*wi4
gen cas1_wi5 = cas1*wi5
gen cas2_wi1 = cas2*wi1
gen cas2_wi2 = cas2*wi2
gen cas2_wi3 = cas2*wi3                       
gen cas2_wi4 = cas2*wi4
gen cas2_wi5 = cas2*wi5
gen wi1_yea = wi1*year_marr 
gen wi2_yea = wi2*year_marr 
gen wi3_yea = wi3*year_marr 
gen wi4_yea = wi4*year_marr 
gen wi5_yea = wi5*year_marr 
gen wi1_pla1 = wi1*pla1
gen wi1_pla2 = wi1*pla2
gen wi2_pla1 = wi2*pla1
gen wi2_pla2 = wi2*pla2
gen wi3_pla1 = wi3*pla1
gen wi3_pla2 = wi3*pla2
gen wi4_pla1 = wi4*pla1
gen wi4_pla2 = wi4*pla2
gen wi5_pla1 = wi5*pla1
gen wi5_pla2 = wi5*pla2


gen hdgen1_siz =  hdgen1*hh_size
gen hdgen2_siz =  hdgen2*hh_size
gen hdgen1_hedu = hdgen1*husb_edu
gen hdgen2_hedu = hdgen2*husb_edu
gen hdgen1_hhdage= hdgen1*hhd_age
gen hdgen2_hhdage= hdgen2*hhd_age
gen cas1_hdgen1 =  cas1*hdgen1
gen cas1_hdgen2 =  cas1*hdgen2
gen cas2_hdgen1 =  cas2*hdgen1
gen cas2_hdgen2 =  cas2*hdgen2
gen hdgen1_yea = hdgen1*year_marr 
gen hdgen2_yea = hdgen2*year_marr 
gen hdgen1_pla1 =  hdgen1*pla1
gen hdgen1_pla2 =  hdgen1*pla2
gen hdgen2_pla1 =  hdgen2*pla1
gen hdgen2_pla2 =  hdgen2*pla2




gen siz_siz =  hh_size*hh_size 
gen hedu_siz = husb_edu*hh_size
gen hhdage_size = hhd_age*hh_size
gen cas1_siz =  cas1*hh_size 
gen cas2_siz =  cas2*hh_size 
gen siz_yea =  hh_size*year_marr 
gen siz_pla1 =  hh_size*pla1
gen siz_pla2 =  hh_size*pla2


gen hedu_hedu = husb_edu*husb_edu
gen hedu_hhdage = husb_edu*hhd_age
gen hedu_cas1 = husb_edu*cas1
gen hedu_cas2 = husb_edu*cas2
gen hedu_yea = husb_edu*year_marr
gen hedu_pla1 = husb_edu*pla1
gen hedu_pla2 = husb_edu*pla2


gen hhdage_hhdage = hhd_age*hhd_age
gen hhdage_cas1 = hhd_age*cas1 
gen hhdage_cas2 = hhd_age*cas2
gen hhdage_yea = hhd_age*year_marr
gen hhdage_pla1 = hhd_age*pla1
gen hhdage_pla2 = hhd_age*pla2


gen cas1_yea =  cas1*year_marr 
gen cas2_yea =  cas2*year_marr 
gen cas1_pla1 = cas1*pla1
gen cas1_pla2 = cas1*pla2
gen cas2_pla1 = cas2*pla1
gen cas2_pla2 = cas2*pla2


gen yea_yea =  year_marr*year_marr
gen pla1_yea =  pla1*year_marr
gen pla2_yea =  pla2*year_marr
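
* A minimal sketch (not part of the original run) that spot-checks a few of the generated
* terms against their definitions. It assumes the gen block above ran without error on the
* endline data; float() mirrors the default float storage type used by gen.

forvalues k = 1/5 {
    * wealth-index dummy times each continuous covariate
    assert wi`k'_age == float(wi`k' * resp_age)
    assert wi`k'_edu == float(wi`k' * resp_edu)
    assert wi`k'_siz == float(wi`k' * hh_size)
    assert wi`k'_yea == float(wi`k' * year_marr)
}
* quadratic terms
assert age_age == float(resp_age * resp_age)
assert yea_yea == float(year_marr * year_marr)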

* Among the 11 linear and 25 nonlinear (quadratic and interaction) terms selected in the baseline data,
* we now check for multicollinearity in the endline dataset.
* Below are the 36 linear, quadratic and interaction terms selected from the baseline model:
* resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu hhd_age caste year_marr place_residence cas1_rel1 hedu_hedu hdgen1_hhdage yea_yea edu_yea hedu_edu rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 wi1_age wi2_age wi3_age wi4_age  hedu_wi1 hedu_wi2  hedu_wi3 hedu_wi4 edu_edu  cas1_yea cas1_age edu_pla1 hedu_cas1  cas1_hdgen1 hdgen1_siz 



* we run a simple regression first on ipv_index_norm in the endline data 
quiet regress ipv_index_norm resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu hhd_age caste year_marr place_residence cas1_rel1 hedu_hedu hdgen1_hhdage yea_yea edu_yea hedu_edu rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 wi1_age wi2_age wi3_age wi4_age  hedu_wi1 hedu_wi2  hedu_wi3 hedu_wi4 edu_edu  cas1_yea cas1_age edu_pla1 hedu_cas1  cas1_hdgen1 hdgen1_siz



* We then run vif, which reports the variance inflation factor (VIF) of each regressor as a measure of multicollinearity. At each step below we remove the nonlinear term with the highest VIF.
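
* A minimal sketch (not part of the original run) of how the manual steps below could be
* automated. It relies on the stored results r(name_#) and r(vif_#) left behind by estat vif.
* The cutoff of 10 is an assumption made for illustration only; this log does not state the
* stopping rule. The local names linear, nonlin_all, nonlin, cutoff, worst and worstvif are
* hypothetical.

local linear resp_age resp_edu religion wealth_index hhd_gender hh_size ///
    husb_edu hhd_age caste year_marr place_residence
local nonlin_all cas1_rel1 hedu_hedu hdgen1_hhdage yea_yea edu_yea hedu_edu ///
    rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 wi1_age wi2_age wi3_age wi4_age ///
    hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 edu_edu cas1_yea cas1_age edu_pla1 ///
    hedu_cas1 cas1_hdgen1 hdgen1_siz
local nonlin `nonlin_all'
* assumed stopping threshold, not taken from the log
local cutoff 10
local done 0
while !`done' {
    quietly regress ipv_index_norm `linear' `nonlin'
    quietly estat vif
    local k : word count `linear' `nonlin'
    local worst
    local worstvif 0
    * find the nonlinear term with the highest VIF; linear terms are never dropped
    forvalues i = 1/`k' {
        local nm `r(name_`i')'
        local isnl : list nm in nonlin
        if "`nm'" != "" & `isnl' & r(vif_`i') < . & r(vif_`i') > `worstvif' {
            local worst `nm'
            local worstvif = r(vif_`i')
        }
    }
    if "`worst'" == "" | `worstvif' < `cutoff' {
        local done 1
    }
    else {
        display "removing `worst' (VIF = " %6.2f `worstvif' ")"
        local nonlin : list nonlin - worst
    }
}
display "nonlinear terms retained: `nonlin'"
* refit the full 36-term model so that the vif call below still refers to it
quietly regress ipv_index_norm `linear' `nonlin_all'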

. vif

    Variable |       VIF       1/VIF  
-------------+----------------------
    cas1_age |    131.65    0.007596
     wi1_age |     97.56    0.010250
    husb_edu |     69.01    0.014490
     wi2_age |     55.88    0.017895
       caste |     49.90    0.020042
    cas1_yea |     40.92    0.024440
   year_marr |     38.29    0.026119
    resp_age |     37.19    0.026887
wealth_index |     36.32    0.027531
    resp_edu |     31.89    0.031356
    religion |     30.59    0.032687
     wi3_age |     30.49    0.032794
    hedu_wi1 |     26.06    0.038370
    hedu_wi2 |     23.82    0.041975
    hedu_edu |     21.22    0.047122
     wi4_age |     20.61    0.048527
    hedu_wi3 |     19.69    0.050799
   hedu_hedu |     19.39    0.051573
hdgen1_hhd~e |     19.03    0.052542
     edu_edu |     18.41    0.054325
    hedu_wi4 |     17.73    0.056391
    rel1_wi1 |     16.88    0.059230
     yea_yea |     16.05    0.062324
  hdgen1_siz |     14.88    0.067191
  hhd_gender |     14.80    0.067559
    rel1_wi2 |      8.62    0.115984
    edu_pla1 |      7.64    0.130891
 cas1_hdgen1 |      7.01    0.142733
     hh_size |      6.64    0.150505
     hhd_age |      6.13    0.163148
   hedu_cas1 |      5.76    0.173722
     edu_yea |      5.34    0.187210
    rel1_wi3 |      5.27    0.189715
    rel1_wi4 |      3.74    0.267633
   cas1_rel1 |      3.68    0.271597
place_resi~e |      3.08    0.324801
-------------+----------------------
    Mean VIF |     26.70
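
* A minimal sketch (not part of the original run) of what a single VIF measures: for
* regressor j, VIF_j = 1/(1 - R2_j), where R2_j comes from regressing j on all the other
* regressors over the same estimation sample. Applied to cas1_age, this should reproduce
* the value tabulated above up to rounding. The local name others and the tempvar used
* are hypothetical.

local others resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu ///
    hhd_age caste year_marr place_residence cas1_rel1 hedu_hedu hdgen1_hhdage ///
    yea_yea edu_yea hedu_edu rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 wi1_age wi2_age ///
    wi3_age wi4_age hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 edu_edu cas1_yea edu_pla1 ///
    hedu_cas1 cas1_hdgen1 hdgen1_siz
quietly regress ipv_index_norm cas1_age `others'
* remember the estimation sample of the main model
tempvar used
generate byte `used' = e(sample)
* auxiliary regression of cas1_age on the remaining 35 regressors
quietly regress cas1_age `others' if `used'
display "VIF of cas1_age = " %8.2f 1/(1 - e(r2))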

. 
* remove cas1_age

quiet regress ipv_index_norm resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu hhd_age caste year_marr place_residence cas1_rel1 hedu_hedu hdgen1_hhdage yea_yea edu_yea hedu_edu rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4 wi1_age wi2_age wi3_age wi4_age  hedu_wi1 hedu_wi2  hedu_wi3 hedu_wi4 edu_edu  cas1_yea  edu_pla1 hedu_cas1  cas1_hdgen1 hdgen1_siz


. vif

    Variable |       VIF       1/VIF  
-------------+----------------------
     wi1_age |     96.67    0.010345
    husb_edu |     68.96    0.014502
     wi2_age |     55.62    0.017981
wealth_index |     35.87    0.027879
    resp_edu |     31.89    0.031362
    religion |     30.58    0.032701
     wi3_age |     30.46    0.032832
   year_marr |     30.31    0.032992
    hedu_wi1 |     26.01    0.038439
    resp_age |     24.58    0.040676
    hedu_wi2 |     23.75    0.042098
    hedu_edu |     21.22    0.047133
     wi4_age |     20.60    0.048536
    hedu_wi3 |     19.64    0.050929
   hedu_hedu |     19.39    0.051583
hdgen1_hhd~e |     19.03    0.052543
     edu_edu |     18.41    0.054328
    hedu_wi4 |     17.73    0.056414
    rel1_wi1 |     16.87    0.059278
     yea_yea |     16.03    0.062376
  hdgen1_siz |     14.86    0.067280
  hhd_gender |     14.80    0.067563
       caste |     10.81    0.092481
    rel1_wi2 |      8.62    0.116045
    edu_pla1 |      7.64    0.130904
 cas1_hdgen1 |      6.99    0.142973
     hh_size |      6.64    0.150668
    cas1_yea |      6.63    0.150904
     hhd_age |      6.13    0.163148
   hedu_cas1 |      5.72    0.174743
     edu_yea |      5.32    0.187882
    rel1_wi3 |      5.27    0.189796
    rel1_wi4 |      3.74    0.267719
   cas1_rel1 |      3.63    0.275212
place_resi~e |      3.08    0.324824
-------------+----------------------
    Mean VIF |     20.96

. 
* remove wi1_age 

quiet regress ipv_index_norm resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu hhd_age caste year_marr place_residence cas1_rel1 hedu_hedu hdgen1_hhdage yea_yea edu_yea hedu_edu rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4  wi2_age wi3_age wi4_age  hedu_wi1 hedu_wi2  hedu_wi3 hedu_wi4 edu_edu  cas1_yea  edu_pla1 hedu_cas1  cas1_hdgen1 hdgen1_siz


. vif

    Variable |       VIF       1/VIF  
-------------+----------------------
    husb_edu |     64.65    0.015468
    resp_edu |     31.37    0.031880
    religion |     30.55    0.032730
   year_marr |     29.12    0.034337
    hedu_wi1 |     23.84    0.041938
    hedu_edu |     21.12    0.047345
    hedu_wi2 |     21.02    0.047565
wealth_index |     20.35    0.049150
   hedu_hedu |     19.33    0.051725
hdgen1_hhd~e |     19.02    0.052569
     edu_edu |     18.39    0.054366
    rel1_wi1 |     16.85    0.059354
    hedu_wi3 |     16.37    0.061083
     yea_yea |     15.80    0.063287
  hdgen1_siz |     14.86    0.067284
  hhd_gender |     14.79    0.067626
    hedu_wi4 |     14.03    0.071283
     wi4_age |     10.91    0.091626
       caste |     10.66    0.093771
    rel1_wi2 |      8.60    0.116231
    resp_age |      8.50    0.117653
     wi3_age |      7.83    0.127638
    edu_pla1 |      7.64    0.130948
 cas1_hdgen1 |      6.99    0.143030
     hh_size |      6.63    0.150738
    cas1_yea |      6.45    0.154975
     hhd_age |      6.13    0.163213
   hedu_cas1 |      5.72    0.174892
    rel1_wi3 |      5.25    0.190543
     wi2_age |      4.64    0.215313
     edu_yea |      4.45    0.224689
    rel1_wi4 |      3.72    0.268732
   cas1_rel1 |      3.63    0.275541
place_resi~e |      3.07    0.325283
-------------+----------------------
    Mean VIF |     14.77

. 

* remove hedu_wi1

quiet regress ipv_index_norm resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu hhd_age caste year_marr place_residence cas1_rel1 hedu_hedu hdgen1_hhdage yea_yea edu_yea hedu_edu rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4  wi2_age wi3_age wi4_age   hedu_wi2  hedu_wi3 hedu_wi4 edu_edu  cas1_yea  edu_pla1 hedu_cas1  cas1_hdgen1 hdgen1_siz


. vif

    Variable |       VIF       1/VIF  
-------------+----------------------
    resp_edu |     31.07    0.032182
    religion |     30.51    0.032780
   year_marr |     28.88    0.034621
    hedu_edu |     20.74    0.048226
hdgen1_hhd~e |     19.02    0.052570
     edu_edu |     18.36    0.054474
   hedu_hedu |     18.23    0.054840
    rel1_wi1 |     16.82    0.059450
     yea_yea |     15.79    0.063348
  hdgen1_siz |     14.86    0.067285
  hhd_gender |     14.79    0.067626
    husb_edu |     13.01    0.076884
       caste |     10.54    0.094832
    rel1_wi2 |      8.60    0.116282
    resp_age |      8.24    0.121404
     wi4_age |      7.90    0.126584
    edu_pla1 |      7.51    0.133116
    hedu_wi4 |      7.14    0.140008
 cas1_hdgen1 |      6.99    0.143031
     hh_size |      6.63    0.150745
    cas1_yea |      6.40    0.156359
     hhd_age |      6.13    0.163213
   hedu_cas1 |      5.62    0.178032
    rel1_wi3 |      5.25    0.190549
     wi3_age |      4.99    0.200284
     edu_yea |      4.33    0.230923
    hedu_wi3 |      4.25    0.235524
wealth_index |      4.01    0.249458
    rel1_wi4 |      3.72    0.268777
   cas1_rel1 |      3.62    0.276206
     wi2_age |      3.41    0.293010
    hedu_wi2 |      3.24    0.308375
place_resi~e |      3.05    0.328091
-------------+----------------------
    Mean VIF |     11.02

. 
* remove hedu_edu

quiet regress ipv_index_norm resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu hhd_age caste year_marr place_residence cas1_rel1 hedu_hedu hdgen1_hhdage yea_yea edu_yea  rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4  wi2_age wi3_age wi4_age   hedu_wi2  hedu_wi3 hedu_wi4 edu_edu  cas1_yea  edu_pla1 hedu_cas1  cas1_hdgen1 hdgen1_siz

. vif

    Variable |       VIF       1/VIF  
-------------+----------------------
    religion |     30.50    0.032785
    resp_edu |     30.31    0.032996
   year_marr |     28.85    0.034658
hdgen1_hhd~e |     19.02    0.052578
    rel1_wi1 |     16.82    0.059465
     yea_yea |     15.78    0.063362
  hdgen1_siz |     14.86    0.067290
  hhd_gender |     14.79    0.067628
     edu_edu |     13.95    0.071699
   hedu_hedu |     12.59    0.079428
    husb_edu |     12.27    0.081491
       caste |     10.53    0.094998
    rel1_wi2 |      8.60    0.116284
    resp_age |      8.22    0.121626
     wi4_age |      7.78    0.128494
    edu_pla1 |      7.51    0.133124
    hedu_wi4 |      7.04    0.142039
 cas1_hdgen1 |      6.99    0.143038
     hh_size |      6.63    0.150747
    cas1_yea |      6.39    0.156391
     hhd_age |      6.13    0.163225
   hedu_cas1 |      5.60    0.178414
    rel1_wi3 |      5.25    0.190570
     wi3_age |      4.96    0.201537
     edu_yea |      4.31    0.231833
    hedu_wi3 |      4.23    0.236487
wealth_index |      3.99    0.250621
    rel1_wi4 |      3.72    0.268778
   cas1_rel1 |      3.62    0.276311
     wi2_age |      3.40    0.294462
    hedu_wi2 |      3.24    0.308966
place_resi~e |      3.05    0.328094
-------------+----------------------
    Mean VIF |     10.34

. 
* remove hdgen1_hhdage

quiet regress ipv_index_norm resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu hhd_age caste year_marr place_residence cas1_rel1 hedu_hedu  yea_yea edu_yea  rel1_wi1 rel1_wi2 rel1_wi3 rel1_wi4  wi2_age wi3_age wi4_age   hedu_wi2  hedu_wi3 hedu_wi4 edu_edu  cas1_yea  edu_pla1 hedu_cas1  cas1_hdgen1 hdgen1_siz


. vif

    Variable |       VIF       1/VIF  
-------------+----------------------
    religion |     30.50    0.032785
    resp_edu |     30.29    0.033010
   year_marr |     28.85    0.034663
    rel1_wi1 |     16.82    0.059465
     yea_yea |     15.78    0.063363
     edu_edu |     13.94    0.071741
  hdgen1_siz |     13.37    0.074776
   hedu_hedu |     12.59    0.079433
    husb_edu |     12.27    0.081505
       caste |     10.52    0.095102
  hhd_gender |      8.80    0.113581
    rel1_wi2 |      8.60    0.116289
    resp_age |      8.22    0.121648
     wi4_age |      7.78    0.128495
    edu_pla1 |      7.51    0.133149
    hedu_wi4 |      7.04    0.142040
 cas1_hdgen1 |      6.99    0.143071
    cas1_yea |      6.39    0.156582
     hh_size |      6.08    0.164452
   hedu_cas1 |      5.60    0.178481
    rel1_wi3 |      5.25    0.190572
     wi3_age |      4.96    0.201566
     edu_yea |      4.31    0.232247
    hedu_wi3 |      4.23    0.236527
wealth_index |      3.99    0.250655
    rel1_wi4 |      3.72    0.268781
   cas1_rel1 |      3.62    0.276333
     wi2_age |      3.40    0.294462
    hedu_wi2 |      3.24    0.308966
place_resi~e |      3.04    0.328477
     hhd_age |      1.33    0.750343
-------------+----------------------
    Mean VIF |      9.65

. 
* remove rel1_wi1

quiet regress ipv_index_norm resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu hhd_age caste year_marr place_residence cas1_rel1 hedu_hedu  yea_yea edu_yea   rel1_wi2 rel1_wi3 rel1_wi4  wi2_age wi3_age wi4_age   hedu_wi2  hedu_wi3 hedu_wi4 edu_edu  cas1_yea  edu_pla1 hedu_cas1  cas1_hdgen1 hdgen1_siz


. vif

    Variable |       VIF       1/VIF  
-------------+----------------------
    resp_edu |     30.29    0.033011
   year_marr |     28.84    0.034671
     yea_yea |     15.78    0.063364
     edu_edu |     13.93    0.071800
  hdgen1_siz |     13.37    0.074781
   hedu_hedu |     12.59    0.079442
    husb_edu |     12.27    0.081515
       caste |     10.49    0.095291
  hhd_gender |      8.80    0.113582
    resp_age |      8.20    0.121901
     wi4_age |      7.73    0.129289
    edu_pla1 |      7.51    0.133181
    hedu_wi4 |      7.02    0.142487
 cas1_hdgen1 |      6.99    0.143071
    cas1_yea |      6.38    0.156642
     hh_size |      6.08    0.164486
   hedu_cas1 |      5.60    0.178624
     wi3_age |      4.93    0.203043
     edu_yea |      4.31    0.232255
    hedu_wi3 |      4.22    0.236918
wealth_index |      3.69    0.271013
   cas1_rel1 |      3.52    0.283990
     wi2_age |      3.39    0.295166
    hedu_wi2 |      3.24    0.309079
    religion |      3.22    0.310456
place_resi~e |      3.04    0.328478
    rel1_wi2 |      1.72    0.580892
    rel1_wi3 |      1.56    0.641672
    rel1_wi4 |      1.42    0.704167
     hhd_age |      1.33    0.750363
-------------+----------------------
    Mean VIF |      8.05

. 
* remove yea_yea

quiet regress ipv_index_norm resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu hhd_age caste year_marr place_residence cas1_rel1 hedu_hedu   edu_yea   rel1_wi2 rel1_wi3 rel1_wi4  wi2_age wi3_age wi4_age   hedu_wi2  hedu_wi3 hedu_wi4 edu_edu  cas1_yea  edu_pla1 hedu_cas1  cas1_hdgen1 hdgen1_siz



. vif

    Variable |       VIF       1/VIF  
-------------+----------------------
    resp_edu |     29.30    0.034135
     edu_edu |     13.86    0.072174
  hdgen1_siz |     13.37    0.074787
   hedu_hedu |     12.55    0.079692
    husb_edu |     12.25    0.081656
   year_marr |     10.64    0.093947
       caste |     10.45    0.095653
  hhd_gender |      8.80    0.113582
    resp_age |      8.16    0.122563
     wi4_age |      7.69    0.130051
    edu_pla1 |      7.51    0.133208
 cas1_hdgen1 |      6.99    0.143080
    hedu_wi4 |      6.99    0.143090
    cas1_yea |      6.34    0.157609
     hh_size |      6.04    0.165440
   hedu_cas1 |      5.60    0.178713
     wi3_age |      4.91    0.203515
    hedu_wi3 |      4.21    0.237638
     edu_yea |      3.71    0.269418
wealth_index |      3.69    0.271064
   cas1_rel1 |      3.52    0.284340
     wi2_age |      3.38    0.295549
    hedu_wi2 |      3.23    0.309580
    religion |      3.22    0.310520
place_resi~e |      3.04    0.328485
    rel1_wi2 |      1.72    0.580977
    rel1_wi3 |      1.56    0.642021
    rel1_wi4 |      1.42    0.704623
     hhd_age |      1.29    0.774026
-------------+----------------------
    Mean VIF |      7.08

. 
* remove  edu_edu
quiet regress ipv_index_norm resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu hhd_age caste year_marr place_residence cas1_rel1 hedu_hedu   edu_yea   rel1_wi2 rel1_wi3 rel1_wi4  wi2_age wi3_age wi4_age   hedu_wi2  hedu_wi3 hedu_wi4  cas1_yea  edu_pla1 hedu_cas1  cas1_hdgen1 hdgen1_siz

. vif

    Variable |       VIF       1/VIF  
-------------+----------------------
  hdgen1_siz |     13.37    0.074787
   hedu_hedu |     11.56    0.086523
    husb_edu |     11.42    0.087587
    resp_edu |     11.40    0.087754
   year_marr |     10.64    0.093958
       caste |     10.45    0.095668
  hhd_gender |      8.80    0.113584
    resp_age |      8.04    0.124375
     wi4_age |      7.66    0.130474
    edu_pla1 |      7.24    0.138041
 cas1_hdgen1 |      6.99    0.143083
    hedu_wi4 |      6.98    0.143186
    cas1_yea |      6.34    0.157609
     hh_size |      6.04    0.165454
   hedu_cas1 |      5.59    0.178911
     wi3_age |      4.89    0.204350
    hedu_wi3 |      4.21    0.237651
wealth_index |      3.64    0.275031
   cas1_rel1 |      3.52    0.284382
     edu_yea |      3.44    0.290314
     wi2_age |      3.37    0.296422
    hedu_wi2 |      3.23    0.309580
    religion |      3.22    0.310529
place_resi~e |      2.99    0.334593
    rel1_wi2 |      1.72    0.581009
    rel1_wi3 |      1.56    0.642048
    rel1_wi4 |      1.42    0.704909
     hhd_age |      1.29    0.774234
-------------+----------------------
    Mean VIF |      6.11

. 
* remove hdgen1_siz
quiet regress ipv_index_norm resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu hhd_age caste year_marr place_residence cas1_rel1 hedu_hedu   edu_yea   rel1_wi2 rel1_wi3 rel1_wi4  wi2_age wi3_age wi4_age   hedu_wi2  hedu_wi3 hedu_wi4  cas1_yea  edu_pla1 hedu_cas1  cas1_hdgen1 

. vif

    Variable |       VIF       1/VIF  
-------------+----------------------
   hedu_hedu |     11.55    0.086563
    husb_edu |     11.41    0.087604
    resp_edu |     11.39    0.087759
   year_marr |     10.64    0.093965
       caste |     10.45    0.095725
    resp_age |      8.04    0.124426
     wi4_age |      7.66    0.130547
    edu_pla1 |      7.24    0.138044
    hedu_wi4 |      6.98    0.143213
 cas1_hdgen1 |      6.98    0.143225
    cas1_yea |      6.34    0.157630
   hedu_cas1 |      5.59    0.178913
     wi3_age |      4.89    0.204382
    hedu_wi3 |      4.21    0.237660
wealth_index |      3.63    0.275311
   cas1_rel1 |      3.52    0.284403
     edu_yea |      3.44    0.290343
     wi2_age |      3.37    0.296484
    hedu_wi2 |      3.23    0.309583
    religion |      3.22    0.310545
place_resi~e |      2.99    0.334636
  hhd_gender |      2.94    0.339619
    rel1_wi2 |      1.72    0.581039
    rel1_wi3 |      1.56    0.642259
    rel1_wi4 |      1.42    0.705152
     hhd_age |      1.29    0.774422
     hh_size |      1.20    0.832302
-------------+----------------------
    Mean VIF |      5.44

. 
* remove hedu_hedu 

quiet regress ipv_index_norm resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu hhd_age caste year_marr place_residence cas1_rel1  edu_yea   rel1_wi2 rel1_wi3 rel1_wi4  wi2_age wi3_age wi4_age   hedu_wi2  hedu_wi3 hedu_wi4  cas1_yea  edu_pla1 hedu_cas1  cas1_hdgen1 


. vif

    Variable |       VIF       1/VIF  
-------------+----------------------
    resp_edu |     11.27    0.088733
   year_marr |     10.60    0.094304
       caste |     10.38    0.096304
    resp_age |      7.90    0.126532
    edu_pla1 |      7.23    0.138362
     wi4_age |      7.09    0.140987
 cas1_hdgen1 |      6.98    0.143232
    hedu_wi4 |      6.44    0.155367
    cas1_yea |      6.34    0.157829
   hedu_cas1 |      5.55    0.180155
     wi3_age |      4.67    0.214052
    husb_edu |      4.42    0.226336
    hedu_wi3 |      4.05    0.247140
wealth_index |      3.51    0.284746
   cas1_rel1 |      3.50    0.285502
     edu_yea |      3.44    0.290399
     wi2_age |      3.32    0.301368
    hedu_wi2 |      3.22    0.310790
    religion |      3.22    0.311027
place_resi~e |      2.98    0.335044
  hhd_gender |      2.94    0.339625
    rel1_wi2 |      1.72    0.581164
    rel1_wi3 |      1.56    0.642355
    rel1_wi4 |      1.42    0.705597
     hhd_age |      1.29    0.775893
     hh_size |      1.20    0.832410
-------------+----------------------
    Mean VIF |      4.86

. 
* remove edu_pla1 (its VIF is below 10, but it is built from resp_edu, whose VIF is above 10)

quiet regress ipv_index_norm resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu hhd_age caste year_marr place_residence cas1_rel1  edu_yea   rel1_wi2 rel1_wi3 rel1_wi4  wi2_age wi3_age wi4_age   hedu_wi2  hedu_wi3 hedu_wi4  cas1_yea   hedu_cas1  cas1_hdgen1 

. vif

    Variable |       VIF       1/VIF  
-------------+----------------------
   year_marr |     10.58    0.094513
       caste |     10.38    0.096323
    resp_age |      7.85    0.127463
 cas1_hdgen1 |      6.98    0.143265
     wi4_age |      6.83    0.146388
    hedu_wi4 |      6.35    0.157437
    cas1_yea |      6.34    0.157840
   hedu_cas1 |      5.55    0.180177
    resp_edu |      4.67    0.214009
     wi3_age |      4.54    0.220031
    husb_edu |      4.42    0.226344
    hedu_wi3 |      4.05    0.247140
   cas1_rel1 |      3.50    0.285527
     edu_yea |      3.43    0.291185
wealth_index |      3.32    0.300967
     wi2_age |      3.28    0.305013
    religion |      3.21    0.311087
    hedu_wi2 |      3.21    0.311232
  hhd_gender |      2.94    0.339912
    rel1_wi2 |      1.72    0.581241
    rel1_wi3 |      1.56    0.643057
    rel1_wi4 |      1.42    0.706551
place_resi~e |      1.34    0.745989
     hhd_age |      1.29    0.775906
     hh_size |      1.20    0.833336
-------------+----------------------
    Mean VIF |      4.40

. 
* remove cas1_hdgen1 (its VIF is below 10, but it is built from caste, whose VIF is above 10)

quiet regress ipv_index_norm resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu hhd_age caste year_marr place_residence cas1_rel1  edu_yea   rel1_wi2 rel1_wi3 rel1_wi4  wi2_age wi3_age wi4_age   hedu_wi2  hedu_wi3 hedu_wi4  cas1_yea   hedu_cas1  

. vif

    Variable |       VIF       1/VIF  
-------------+----------------------
   year_marr |     10.58    0.094562
    resp_age |      7.83    0.127651
     wi4_age |      6.83    0.146429
       caste |      6.81    0.146825
    hedu_wi4 |      6.35    0.157471
    cas1_yea |      6.33    0.157975
   hedu_cas1 |      5.53    0.180713
    resp_edu |      4.67    0.214103
     wi3_age |      4.54    0.220033
    husb_edu |      4.41    0.226966
    hedu_wi3 |      4.05    0.247141
   cas1_rel1 |      3.50    0.285530
     edu_yea |      3.43    0.291249
wealth_index |      3.32    0.301305
     wi2_age |      3.28    0.305018
    hedu_wi2 |      3.21    0.311236
    religion |      3.21    0.311358
    rel1_wi2 |      1.72    0.581253
    rel1_wi3 |      1.55    0.643088
    rel1_wi4 |      1.41    0.707447
place_resi~e |      1.34    0.746016
     hhd_age |      1.29    0.777493
     hh_size |      1.20    0.833635
  hhd_gender |      1.07    0.934253
-------------+----------------------
    Mean VIF |      4.06

. 
* remove cas1_yea (year_marr is above 10, and cas1_yea is the first non-linear term in the list that involves year_marr. wi4_age, the first non-linear term in the list overall, is unrelated to year_marr and has a VIF of only 6.83, so it is not a good candidate for removal)

quiet regress ipv_index_norm resp_age resp_edu religion wealth_index hhd_gender hh_size husb_edu hhd_age caste year_marr place_residence cas1_rel1  edu_yea   rel1_wi2 rel1_wi3 rel1_wi4  wi2_age wi3_age wi4_age   hedu_wi2  hedu_wi3 hedu_wi4     hedu_cas1


. vif

    Variable |       VIF       1/VIF  
-------------+----------------------
   year_marr |      8.93    0.111951
    resp_age |      7.83    0.127661
     wi4_age |      6.79    0.147350
    hedu_wi4 |      6.32    0.158165
   hedu_cas1 |      5.34    0.187380
    resp_edu |      4.59    0.217693
     wi3_age |      4.52    0.221298
    husb_edu |      4.26    0.234480
    hedu_wi3 |      4.03    0.248172
   cas1_rel1 |      3.50    0.285844
     edu_yea |      3.35    0.298793
wealth_index |      3.31    0.301706
     wi2_age |      3.27    0.306089
    religion |      3.21    0.311358
    hedu_wi2 |      3.20    0.312199
       caste |      2.91    0.343089
    rel1_wi2 |      1.72    0.581255
    rel1_wi3 |      1.55    0.643094
    rel1_wi4 |      1.41    0.707449
place_resi~e |      1.34    0.746512
     hhd_age |      1.29    0.777496
     hh_size |      1.20    0.835459
  hhd_gender |      1.07    0.934511
-------------+----------------------
    Mean VIF |      3.69

. 


* Now all VIF values are below 10. (23 terms)
* we have 11 linear and 12 non linear terms 

* resp_age  resp_edu  year_marr  caste  husb_edu wealth_index  religion hhd_gender place_residence hhd_age hh_size  wi4_age  wi3_age  wi2_age hedu_cas1 hedu_wi4  hedu_wi3 hedu_wi2 cas1_rel1  edu_yea rel1_wi2 rel1_wi3 rel1_wi4 
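
* A possible automated check of the claim above (a sketch, not part of the original
* analysis). It assumes the last -regress- above is still the active estimation
* result and relies on -estat vif- storing r(name_#) and r(vif_#).
quietly estat vif
local maxvif = 0
local k = 1
while "`r(name_`k')'" != "" {
    * keep the largest VIF seen so far
    if r(vif_`k') > `maxvif' local maxvif = r(vif_`k')
    local ++k
}
local nterms = `k' - 1
display "terms checked: `nterms'   largest VIF: " %5.2f `maxvif'
assert `maxvif' < 10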
 
 
* wealth_index is a categorical variable, so we replace it with its dummies here 
 
* resp_age  resp_edu  year_marr  caste  husb_edu richest richer middle poorer poorest  religion hhd_gender place_residence hhd_age hh_size  wi4_age  wi3_age  wi2_age hedu_cas1 hedu_wi4  hedu_wi3 hedu_wi2 cas1_rel1  edu_yea rel1_wi2 rel1_wi3 rel1_wi4
 

* To keep every category within the interaction terms, we add wi1_age, hedu_wi1 and rel1_wi1 to this list of controls (the wi5 terms are left out to avoid the dummy-variable trap).
* We also tried to keep every caste*religion interaction. However, three of the four interaction terms get dropped when all of them are included, so we keep only cas2*rel2, which provides estimates for Hindu*SC/ST observations.


* so we have 14 linear controls and 15 non linear controls for our further psm did analysis 
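
* The counts above can be verified with macros (a sketch; the two lists are copied
* from the psmatch2 call further down, and the macro names are illustrative only).
local lin resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size
local nonlin wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4
local n_lin : word count `lin'
local n_nonlin : word count `nonlin'
display "linear: `n_lin'   non-linear: `n_nonlin'"
assert `n_lin' == 14 & `n_nonlin' == 15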



* PSM DID 
* We open the master file. Remember that the baseline and endline datasets were created only to identify controls.
* We first add all quadratic and interaction terms to this dataset, as was done separately for the baseline and endline datasets.
* We create all non-linear terms as before but use only those identified in line 2085.


* open master dataset 

* generate all interaction and quadratic terms 

* we first create dummies for each categorical variable 

tabulate religion, generate(rel)
tabulate wealth_index, generate(wi)
tabulate hhd_gender, generate(hdgen)
tabulate caste, generate(cas)
tabulate place_residence, generate(pla)

* now we create interaction terms 
 
gen age_age = resp_age*resp_age
gen edu_age = resp_edu*resp_age
gen rel1_age =  (rel1)*resp_age
gen rel2_age =  (rel2)*resp_age
gen wi1_age = (wi1)*resp_age 
gen wi2_age = (wi2)*resp_age 
gen wi3_age = (wi3)*resp_age 
gen wi4_age = (wi4)*resp_age 
gen wi5_age = (wi5)*resp_age 
gen hdgen1_age = hdgen1*resp_age
gen hdgen2_age = hdgen2*resp_age
gen siz_age =  hh_size*resp_age 
gen hedu_age = husb_edu*resp_age
gen hhdage_age = hhd_age*resp_age
gen cas1_age =  cas1*resp_age
gen cas2_age =  cas2*resp_age
gen yea_age =  year_marr*resp_age
gen pla1_age =  pla1*resp_age
gen pla2_age =  pla2*resp_age

gen edu_edu =  resp_edu*resp_edu 
gen rel1_edu =  rel1*resp_edu
gen rel2_edu =  rel2*resp_edu
gen wi1_edu = wi1*resp_edu
gen wi2_edu = wi2*resp_edu
gen wi3_edu = wi3*resp_edu
gen wi4_edu = wi4*resp_edu
gen wi5_edu = wi5*resp_edu
gen hdgen1_edu = hdgen1*resp_edu
gen hdgen2_edu = hdgen2*resp_edu
gen edu_siz =  resp_edu*hh_size 
gen hedu_edu = husb_edu*resp_edu
gen hhdage_edu = hhd_age*resp_edu
gen cas1_edu = cas1*resp_edu
gen cas2_edu = cas2*resp_edu
gen edu_yea = resp_edu*year_marr 
gen edu_pla1 = resp_edu*pla1
gen edu_pla2 = resp_edu*pla2


gen rel1_wi1 =  rel1*wi1
gen rel1_wi2 =  rel1*wi2
gen rel1_wi3 =  rel1*wi3
gen rel1_wi4 =  rel1*wi4
gen rel1_wi5 =  rel1*wi5
gen rel2_wi1 =  rel2*wi1
gen rel2_wi2 =  rel2*wi2
gen rel2_wi3 =  rel2*wi3
gen rel2_wi4 =  rel2*wi4
gen rel2_wi5 =  rel2*wi5
gen hdgen1_rel1 =  hdgen1*rel1
gen hdgen1_rel2 =  hdgen1*rel2
gen hdgen2_rel1 =  hdgen2*rel1
gen hdgen2_rel2 =  hdgen2*rel2
gen rel1_siz =  rel1*hh_size
gen rel2_siz =  rel2*hh_size
gen hedu_rel1 = husb_edu*rel1
gen hedu_rel2 = husb_edu*rel2
gen hhdage_rel1 = hhd_age*rel1
gen hhdage_rel2 = hhd_age*rel2
gen cas1_rel1 =  cas1*rel1
gen cas1_rel2 =  cas1*rel2
gen cas2_rel1 =  cas2*rel1
gen cas2_rel2 =  cas2*rel2
gen rel1_yea =  rel1*year_marr
gen rel2_yea =  rel2*year_marr
gen rel1_pla1 =  rel1*pla1
gen rel1_pla2 =  rel1*pla2
gen rel2_pla1 =  rel2*pla1
gen rel2_pla2 =  rel2*pla2



gen hdgen1_wi1 = hdgen1*wi1
gen hdgen1_wi2 = hdgen1*wi2
gen hdgen1_wi3 = hdgen1*wi3
gen hdgen1_wi4 = hdgen1*wi4
gen hdgen1_wi5 = hdgen1*wi5
gen hdgen2_wi1 = hdgen2*wi1
gen hdgen2_wi2 = hdgen2*wi2
gen hdgen2_wi3 = hdgen2*wi3
gen hdgen2_wi4 = hdgen2*wi4
gen hdgen2_wi5 = hdgen2*wi5
gen wi1_siz = wi1*hh_size
gen wi2_siz = wi2*hh_size
gen wi3_siz = wi3*hh_size
gen wi4_siz = wi4*hh_size
gen wi5_siz = wi5*hh_size
gen hedu_wi1 = husb_edu*wi1
gen hedu_wi2 = husb_edu*wi2
gen hedu_wi3 = husb_edu*wi3
gen hedu_wi4 = husb_edu*wi4
gen hedu_wi5 = husb_edu*wi5
gen hhdage_wi1 = hhd_age*wi1
gen hhdage_wi2 = hhd_age*wi2
gen hhdage_wi3 = hhd_age*wi3
gen hhdage_wi4 = hhd_age*wi4
gen hhdage_wi5 = hhd_age*wi5
gen cas1_wi1 = cas1*wi1
gen cas1_wi2 = cas1*wi2
gen cas1_wi3 = cas1*wi3
gen cas1_wi4 = cas1*wi4
gen cas1_wi5 = cas1*wi5
gen cas2_wi1 = cas2*wi1
gen cas2_wi2 = cas2*wi2
gen cas2_wi3 = cas2*wi3
gen cas2_wi4 = cas2*wi4
gen cas2_wi5 = cas2*wi5
gen wi1_yea = wi1*year_marr 
gen wi2_yea = wi2*year_marr 
gen wi3_yea = wi3*year_marr 
gen wi4_yea = wi4*year_marr 
gen wi5_yea = wi5*year_marr 
gen wi1_pla1 = wi1*pla1
gen wi1_pla2 = wi1*pla2
gen wi2_pla1 = wi2*pla1
gen wi2_pla2 = wi2*pla2
gen wi3_pla1 = wi3*pla1
gen wi3_pla2 = wi3*pla2
gen wi4_pla1 = wi4*pla1
gen wi4_pla2 = wi4*pla2
gen wi5_pla1 = wi5*pla1
gen wi5_pla2 = wi5*pla2


gen hdgen1_siz =  hdgen1*hh_size
gen hdgen2_siz =  hdgen2*hh_size
gen hdgen1_hedu = hdgen1*husb_edu
gen hdgen2_hedu = hdgen2*husb_edu
gen hdgen1_hhdage= hdgen1*hhd_age
gen hdgen2_hhdage= hdgen2*hhd_age
gen cas1_hdgen1 =  cas1*hdgen1
gen cas1_hdgen2 =  cas1*hdgen2
gen cas2_hdgen1 =  cas2*hdgen1
gen cas2_hdgen2 =  cas2*hdgen2
gen hdgen1_yea = hdgen1*year_marr 
gen hdgen2_yea = hdgen2*year_marr 
gen hdgen1_pla1 =  hdgen1*pla1
gen hdgen1_pla2 =  hdgen1*pla2
gen hdgen2_pla1 =  hdgen2*pla1
gen hdgen2_pla2 =  hdgen2*pla2




gen siz_siz =  hh_size*hh_size 
gen hedu_siz = husb_edu*hh_size
gen hhdage_size = hhd_age*hh_size
gen cas1_siz =  cas1*hh_size 
gen cas2_siz =  cas2*hh_size 
gen siz_yea =  hh_size*year_marr 
gen siz_pla1 =  hh_size*pla1
gen siz_pla2 =  hh_size*pla2


gen hedu_hedu = husb_edu*husb_edu
gen hedu_hhdage = husb_edu*hhd_age
gen hedu_cas1 = husb_edu*cas1
gen hedu_cas2 = husb_edu*cas2
gen hedu_yea = husb_edu*year_marr
gen hedu_pla1 = husb_edu*pla1
gen hedu_pla2 = husb_edu*pla2


gen hhdage_hhdage = hhd_age*hhd_age
gen hhdage_cas1 = hhd_age*cas1 
gen hhdage_cas2 = hhd_age*cas2
gen hhdage_yea = hhd_age*year_marr
gen hhdage_pla1 = hhd_age*pla1
gen hhdage_pla2 = hhd_age*pla2


gen cas1_yea =  cas1*year_marr 
gen cas2_yea =  cas2*year_marr 
gen cas1_pla1 = cas1*pla1
gen cas1_pla2 = cas1*pla2
gen cas2_pla1 = cas2*pla1
gen cas2_pla2 = cas2*pla2


gen yea_yea =  year_marr*year_marr
gen pla1_yea =  pla1*year_marr
gen pla2_yea =  pla2*year_marr
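
* Optional spot checks on the terms built above (a sketch, not part of the original
* analysis). They assume wealth_index has five categories, so that
* -tabulate, generate()- produced wi1-wi5.
* the wealth dummies should be mutually exclusive and exhaustive
assert wi1 + wi2 + wi3 + wi4 + wi5 == 1 if !missing(wealth_index)
forvalues k = 1/5 {
    * each interaction should equal the product of its two components
    assert wi`k'_age == float(wi`k' * resp_age)
    assert hedu_wi`k' == float(husb_edu * wi`k')
}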


* We first create an indicator for pre-period Bihar observations. These observations are the treated group for matching purposes.
* We then match the other three groups (post-period Bihar, pre-period Jharkhand and post-period Jharkhand) to pre-period Bihar and create weights.


gen control_year = 1 if year == 0
replace control_year = 0 if year == 1
gen bihar_control = bihar * control_year
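
* A quick check that the indicator flags only pre-period Bihar (a sketch, not part
* of the original analysis; it assumes bihar and year are coded 0/1).
tabulate bihar year, missing
assert bihar_control == (bihar == 1 & year == 0) if !missing(bihar, year)
count if bihar_control == 1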

* we now match (here bihar_control takes a value of 1 for pre bihar observations)
* we remove base dummies to address dummy trap 

psmatch2 bihar_control resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   , kernel common 
 
. psmatch2 bihar_control resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    r
> eligion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_w
> i1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   , kernel commo
> n

Probit regression                                       Number of obs = 12,307
                                                        LR chi2(29)   = 999.16
                                                        Prob > chi2   = 0.0000
Log likelihood = -7186.9714                             Pseudo R2     = 0.0650

---------------------------------------------------------------------------------
  bihar_control | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
----------------+----------------------------------------------------------------
       resp_age |  -.0221628   .0089429    -2.48   0.013    -.0396905    -.004635
       resp_edu |   -.062914    .005447   -11.55   0.000    -.0735898   -.0522381
      year_marr |  -.0173608   .0040614    -4.27   0.000    -.0253209   -.0094007
          caste |    -1.1863   .0855214   -13.87   0.000    -1.353919   -1.018681
       husb_edu |   .0400338   .0191188     2.09   0.036     .0025617    .0775059
        poorest |   .6473258   .4095439     1.58   0.114    -.1553655    1.450017
         poorer |   .3591799     .41264     0.87   0.384    -.4495796    1.167939
         middle |   .4202449   .4243266     0.99   0.322    -.4114199     1.25191
         richer |   .2330906   .4477375     0.52   0.603    -.6444587     1.11064
       religion |   .7942534   .2093117     3.79   0.000     .3840099    1.204497
     hhd_gender |   .2896009   .0306797     9.44   0.000     .2294697    .3497321
place_residence |   .0052869   .0392091     0.13   0.893    -.0715615    .0821353
        hhd_age |   .0013592   .0009901     1.37   0.170    -.0005814    .0032997
        hh_size |   .0559461   .0059434     9.41   0.000     .0442973     .067595
        wi1_age |   .0207821   .0085525     2.43   0.015     .0040194    .0375448
        wi2_age |   .0242828   .0085823     2.83   0.005     .0074618    .0411038
        wi3_age |   .0151131    .008771     1.72   0.085    -.0020777    .0323039
        wi4_age |   .0211968    .009168     2.31   0.021     .0032277    .0391658
      hedu_cas2 |   .0130316    .005614     2.32   0.020     .0020283    .0240349
       hedu_wi1 |  -.0573813   .0195541    -2.93   0.003    -.0957065    -.019056
       hedu_wi2 |  -.0344538    .019747    -1.74   0.081    -.0731573    .0042497
       hedu_wi3 |   -.006599   .0203031    -0.33   0.745    -.0463923    .0331943
       hedu_wi4 |  -.0047517   .0217231    -0.22   0.827    -.0473281    .0378247
      cas2_rel2 |   .7074834   .0865978     8.17   0.000     .5377549     .877212
        edu_yea |   .0020564   .0003519     5.84   0.000     .0013666    .0027462
       rel2_wi1 |  -.6329507   .2158801    -2.93   0.003    -1.056068   -.2098336
       rel2_wi2 |   -.662047   .2206688    -3.00   0.003     -1.09455   -.2295441
       rel2_wi3 |  -.5966915   .2275415    -2.62   0.009    -1.042665   -.1507184
       rel2_wi4 |  -.6517214     .23664    -2.75   0.006    -1.115527   -.1879155
          _cons |  -.9709017   .4083038    -2.38   0.017    -1.771162    -.170641
---------------------------------------------------------------------------------

. 


* we have created weights through the above code 
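
* To inspect what psmatch2 left behind (a sketch, not part of the original analysis):
* _pscore is the estimated propensity score, _weight the kernel matching weight,
* _treated the treatment indicator and _support the common-support flag.
summarize _pscore _weight
tabulate _treated _support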

* this command reports t-tests of mean differences, variance ratios, and Rubin's B and R for both the unmatched and matched samples (TABLE 2)
pstest , both 

. pstest , both

* THE OUTPUT BELOW IS TABLE 2 FOR THE UNWEIGHTED (U) AND WEIGHTED (M) SAMPLES.

----------------------------------------------------------------------------------------
                Unmatched |       Mean               %reduct |     t-test    |  V(T)/
Variable          Matched | Treated Control    %bias  |bias| |    t    p>|t| |  V(C)
--------------------------+----------------------------------+---------------+----------
resp_age               U  | 31.444   32.165     -8.9         |  -4.57  0.000 |  0.96
                       M  | 31.446   31.516     -0.9    90.2 |  -0.39  0.695 |  1.06
                          |                                  |               |
resp_edu               U  | 3.7426   4.6871    -18.9         |  -9.68  0.000 |  0.89*
                       M  | 3.7457    3.782     -0.7    96.2 |  -0.33  0.739 |  1.03
                          |                                  |               |
year_marr              U  | 13.563   14.132     -6.6         |  -3.41  0.001 |  0.94*
                       M  | 13.565   13.694     -1.5    77.2 |  -0.69  0.493 |  1.04
                          |                                  |               |
caste                  U  | .22968   .37866    -32.8         | -16.51  0.000 |     .
                       M  | .22992   .23255     -0.6    98.2 |  -0.27  0.783 |     .
                          |                                  |               |
husb_edu               U  | 6.1976   6.6041     -7.8         |  -4.03  0.000 |  1.03
                       M  | 6.1958   6.2138     -0.3    95.6 |  -0.15  0.879 |  1.04
                          |                                  |               |
poorest                U  | .49423    .4705      4.8         |   2.45  0.014 |     .
                       M  | .49423   .49286      0.3    94.3 |   0.12  0.904 |     .
                          |                                  |               |
poorer                 U  | .24045   .24542     -1.2         |  -0.60  0.550 |     .
                       M  | .24044   .23872      0.4    65.4 |   0.18  0.859 |     .
                          |                                  |               |
middle                 U  | .14483   .13859      1.8         |   0.93  0.354 |     .
                       M  | .14473   .14619     -0.4    76.5 |  -0.18  0.854 |     .
                          |                                  |               |
richer                 U  | .08741   .08851     -0.4         |  -0.20  0.842 |     .
                       M  |  .0875   .08868     -0.4    -7.1 |  -0.18  0.855 |     .
                          |                                  |               |
religion               U  | .85542   .78837     17.6         |   8.84  0.000 |     .
                       M  | .85527   .85293      0.6    96.5 |   0.29  0.770 |     .
                          |                                  |               |
hhd_gender             U  | .24763   .17535     17.8         |   9.39  0.000 |     .
                       M  | .24788   .24847     -0.1    99.2 |  -0.06  0.952 |     .
                          |                                  |               |
place_residence        U  | .14509   .16762     -6.2         |  -3.17  0.002 |     .
                       M  | .14524   .14277      0.7    89.1 |   0.31  0.757 |     .
                          |                                  |               |
hhd_age                U  | 44.585   44.172      3.0         |   1.55  0.122 |  1.11*
                       M  | 44.567   44.546      0.2    94.9 |   0.07  0.947 |  1.02
                          |                                  |               |
hh_size                U  | 5.7006   5.2797     18.5         |   9.83  0.000 |  1.35*
                       M  | 5.6849   5.6475      1.6    91.1 |   0.70  0.483 |  1.06
                          |                                  |               |
wi1_age                U  | 15.352    15.02      2.0         |   1.02  0.307 |  0.96
                       M  | 15.355   15.362     -0.0    98.0 |  -0.02  0.986 |  1.00
                          |                                  |               |
wi2_age                U  | 7.6109    7.798     -1.3         |  -0.68  0.497 |  0.98
                       M  | 7.6084   7.5557      0.4    71.8 |   0.17  0.869 |  1.01
                          |                                  |               |
wi3_age                U  | 4.5411   4.4806      0.5         |   0.27  0.787 |  0.98
                       M  | 4.5381   4.5913     -0.5    12.1 |  -0.20  0.838 |  0.99
                          |                                  |               |
wi4_age                U  |  2.847   2.8814     -0.4         |  -0.19  0.852 |  0.99
                       M  | 2.8499   2.8999     -0.5   -45.4 |  -0.23  0.817 |  0.98
                          |                                  |               |
hedu_cas2              U  | 1.0948   2.0209    -25.9         | -12.80  0.000 |  0.60*
                       M  |  1.096   1.1088     -0.4    98.6 |  -0.18  0.855 |  1.00
                          |                                  |               |
hedu_wi1               U  | 1.8231   2.0325     -5.8         |  -2.95  0.003 |  0.91*
                       M  | 1.8224   1.8225     -0.0   100.0 |  -0.00  0.999 |  1.05
                          |                                  |               |
hedu_wi2               U  | 1.5719   1.6542     -2.2         |  -1.15  0.249 |  0.94
                       M  |  1.571   1.5641      0.2    91.6 |   0.08  0.933 |  1.02
                          |                                  |               |
hedu_wi3               U  | 1.3609   1.2122      4.1         |   2.16  0.031 |  1.14*
                       M  | 1.3593   1.3685     -0.3    93.8 |  -0.11  0.912 |  1.01
                          |                                  |               |
hedu_wi4               U  | .99257   .96253      0.9         |   0.46  0.642 |  1.06
                       M  | .99358   1.0031     -0.3    68.3 |  -0.12  0.902 |  1.00
                          |                                  |               |
cas2_rel2              U  | .21712   .28694    -16.1         |  -8.19  0.000 |     .
                       M  | .21735   .21863     -0.3    98.2 |  -0.14  0.891 |     .
                          |                                  |               |
edu_yea                U  | 40.573   49.812    -12.9         |  -6.58  0.000 |  0.89*
                       M  | 40.605   40.911     -0.4    96.7 |  -0.20  0.844 |  1.05
                          |                                  |               |
rel2_wi1               U  |  .4181   .35736     12.5         |   6.48  0.000 |     .
                       M  | .41801   .41823     -0.0    99.6 |  -0.02  0.985 |     .
                          |                                  |               |
rel2_wi2               U  | .20687   .20081      1.5         |   0.78  0.436 |     .
                       M  | .20683    .2038      0.8    50.1 |   0.33  0.741 |     .
                          |                                  |               |
rel2_wi3               U  | .12535   .11278      3.9         |   2.02  0.043 |     .
                       M  | .12522   .12468      0.2    95.7 |   0.07  0.943 |     .
                          |                                  |               |
rel2_wi4               U  | .07383   .07185      0.8         |   0.39  0.694 |     .
                       M  |  .0739   .07467     -0.3    61.3 |  -0.13  0.898 |     .
                          |                                  |               |
----------------------------------------------------------------------------------------
* if variance ratio outside [0.94; 1.06] for U and [0.94; 1.06] for M

-----------------------------------------------------------------------------------
 Sample    | Ps R2   LR chi2   p>chi2   MeanBias   MedBias      B      R     %Var
-----------+-----------------------------------------------------------------------
 Unmatched | 0.065    999.16    0.000      8.2       4.8      62.3*   0.54     50
 Matched   | 0.000      3.77    1.000      0.5       0.4       4.4    1.15      0
-----------------------------------------------------------------------------------
* if B>25%, R outside [0.5; 2]

. 

 

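* After matching, no covariate differs significantly between the treated and control groups (all matched-sample p-values exceed 0.48).
* Rubin's B (4.4) and R (1.15) are also within the recommended limits (B below 25, R within [0.5, 2]).
* The samples are therefore balanced on observed covariates, as required for conditional independence.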

* this command gives the density function of the propensity score for the matched and unmatched samples
pstest  _pscore , both density      




* this command gives the distribution of propensity score of both treated and control units (onsupport/offsupport)
psgraph 





* Now we run the DID regression on the matched sample 
* weights are only assigned to observations on support 

* resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4  
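
* A minimal sketch of such a regression, under assumptions that are NOT confirmed by
* this file: the psmatch2 weights enter as pweights, the sample is restricted to
* common support, district fixed effects are included (as in the ritest calls
* further down) and standard errors are clustered by district. The macro name
* "controls" is illustrative only.
local controls resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4
* the coefficient on c.bihar#c.year is the DID estimate
regress ipv_index_norm c.bihar##c.year `controls' i.district [pweight = _weight] if _support == 1, vce(cluster district)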


* THE FILE ON WHICH WE RUN OUR ANALYSIS IS CREATED NOW.
* save this file as 'main' file 
* you may directly open the main file to run the analysis 





/* FIGURE 2 :  NCRB data graph */
* open the ncrb data file 

* replica of the ncrb graph 

twoway line Bihar Jharkhand cruelitybyhusbandorrelatives, lpattern(solid dash) xlabel(2011(1)2022) xtitle("") ytitle("Number of incidences of cruelty by husband or relatives") xline(2016)


/* FIGURE 3 : Kernel density graph for exact randomization test */

/* Code to run the results of exact randomization test using data of Bihar and Jharkhand from DHS-4(2015-16) and DHS-5(2019-21)  */
/* We perform  randomization test (treatment) on all 5 outcome variables  */
/* Code for result on treatment randomization  */
* open main file 
* REMEMBER: THE DENSITY GRAPH MAY CHANGE BETWEEN RUNS BECAUSE THE 1000 RANDOMIZATION DRAWS DIFFER. HOWEVER, THE RED LINE (ACTUAL ESTIMATE) WILL ALWAYS REMAIN IN THE LEFT TAIL OF THE GRAPH.



ssc install ritest
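
* The draws can be made reproducible by fixing the random-number seed before the
* ritest calls (a sketch; the seed value is arbitrary and not from the original analysis).
set seed 12345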

ritest bihar _b[c.bihar#c.year], cluster(distyear) reps(1000) saving(dit1) kdensityplot : reg emo_index_norm c.bihar##c.year resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   i.district

ritest bihar _b[c.bihar#c.year], cluster(distyear) reps(1000) saving(ditt1) kdensityplot : reg phy_index_norm c.bihar##c.year resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   i.district

ritest bihar _b[c.bihar#c.year], cluster(distyear) reps(1000) saving(ditt3) kdensityplot : reg sexual_index_norm c.bihar##c.year resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   i.district 

ritest bihar _b[c.bihar#c.year], cluster(distyear) reps(1000) saving(ditt4) kdensityplot : reg control_index_norm c.bihar##c.year resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   i.district

ritest bihar _b[c.bihar#c.year], cluster(distyear) reps(1000) saving(ditt7) kdensityplot : reg ipv_index_norm c.bihar##c.year resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   i.district



/* TABLE 1 - SAMPLE MEAN OF OUTCOME VARIABLES (done on outcome dummy variables ) */
/* OPEN THE MAIN FILE */

summ emo_d phy_d sexual_d control_d ipv_d if year==0 & bihar==1
summ emo_d phy_d sexual_d control_d ipv_d if year==0 & bihar==0
summ emo_d phy_d sexual_d control_d ipv_d if year==0 
summ emo_d phy_d sexual_d control_d ipv_d if year==1 & bihar==1
summ emo_d phy_d sexual_d control_d ipv_d if year==1 & bihar==0
summ emo_d phy_d sexual_d control_d ipv_d if year==1
summ emo_d phy_d sexual_d control_d ipv_d 
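
* A compact cross-check of the Table 1 cells (a sketch, not part of the original
* workflow). cell_chk is a hypothetical helper variable and is dropped at the end.
* tabstat prints the mean, sd and N of each outcome dummy for the four
* year-by-state cells, followed by the pooled total.
egen cell_chk = group(year bihar), label
tabstat emo_d phy_d sexual_d control_d ipv_d, by(cell_chk) statistics(mean sd n)
drop cell_chk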





/* TABLE 2 - Balance of the covariates: weighted and unweighted samples */
* REFER TO CODE LINE 2364 FOR THIS TABLE 





/* TABLE 3 - PARALLEL TRENDS  */
/* Open pta 05 06 15 16 bihar jharkhand main file from the parallel trend folder */ 

* create the standardized outcome variables 

* We standardize outcome variables separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome variables for control group (Jharkhand) at baseline 
* note: the 2005-06 database has only two indicators of sexual violence (sexual1 and sexual2)

summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2  control1 control2 control3 control4 control5 control6 if bihar == 0 & year==0


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,566    .1047254    .3062971          0          1
        emo2 |      1,567    .0561583    .2303006          0          1
        emo3 |      1,567    .0491385    .2162261          0          1
        phy1 |      1,567     .098277    .2977838          0          1
        phy2 |      1,567    .2259094    .4183133          0          1
-------------+---------------------------------------------------------
        phy3 |      1,566     .087484    .2826333          0          1
        phy4 |      1,566    .0747126    .2630111          0          1
        phy5 |      1,567    .0204212    .1414812          0          1
        phy6 |      1,567    .0121251    .1094793          0          1
        phy7 |      1,566    .1111111    .3143701          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,567    .0823229    .2749438          0          1
     sexual2 |      1,567    .0127632    .1122871          0          1
    control1 |      1,560    .2891026     .453491          0          1
    control2 |      1,564    .0831202    .2761521          0          1
    control3 |      1,565    .0971246     .296222          0          1
-------------+---------------------------------------------------------
    control4 |      1,566    .1749681    .3800612          0          1
    control5 |      1,566    .2209451    .4150161          0          1
    control6 |      1,565    .1309904    .3374977          0          1


	
* standardize all outcome variables for all observations at baseline (Bihar and Jharkhand)

gen emo_1_norm = (emo1 - .1047254)/.3062971 if year==0
gen emo_2_norm = (emo2 - .0561583)/.2303006 if year==0
gen emo_3_norm = (emo3 - .0491385 )/.2162261 if year==0

gen phy_1_norm = (phy1 - .098277)/.2977838 if year==0
gen phy_2_norm = (phy2 - .2259094  )/.4183133  if year==0
gen phy_3_norm = (phy3 - .087484)/.2826333 if year==0
gen phy_4_norm = (phy4 - .0747126)/.2630111 if year==0
gen phy_5_norm = (phy5 - .0204212)/.1414812 if year==0
gen phy_6_norm = (phy6 - .0121251)/.1094793 if year==0
gen phy_7_norm = (phy7 - .1111111)/.3143701  if year==0

gen sexual_1_norm = (sexual1 - .0823229  )/.2749438 if year==0
gen sexual_2_norm = (sexual2 - .0127632 )/.1122871  if year==0 

gen control_1_norm = (control1 - .2891026 )/.453491  if year==0
gen control_2_norm = (control2 - .0831202)/.2761521 if year==0 
gen control_3_norm = (control3 - .0971246  )/.296222  if year==0
gen control_4_norm = (control4 - .1749681 )/.3800612  if year==0
gen control_5_norm = (control5 - .2209451 )/.4150161  if year==0 
gen control_6_norm = (control6 -  .1309904 )/.3374977 if year==0

* for endline observations 
* mean and sd of outcome variables for control group (Jharkhand) at endline
summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2  control1 control2 control3 control4 control5 control6 if bihar == 0 & year==1


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      2,524    .0412044    .1988021          0          1
        emo2 |      2,524    .0388273    .1932214          0          1
        emo3 |      2,524    .0348653    .1834749          0          1
        phy1 |      2,524     .086767    .2815492          0          1
        phy2 |      2,524    .2044374    .4033698          0          1
-------------+---------------------------------------------------------
        phy3 |      2,524    .0736926    .2613217          0          1
        phy4 |      2,524    .0526941    .2234664          0          1
        phy5 |      2,524    .0110935    .1047606          0          1
        phy6 |      2,524    .0055468    .0742844          0          1
        phy7 |      2,524     .088748     .284436          0          1
-------------+---------------------------------------------------------
     sexual1 |      2,524    .0455626    .2085758          0          1
     sexual2 |      2,524    .0178288    .1323553          0          1
    control1 |      2,522    .2664552    .4421927          0          1
    control2 |      2,523    .0757035    .2645756          0          1
    control3 |      2,521    .3201111    .4666115          0          1
-------------+---------------------------------------------------------
    control4 |      2,520     .168254    .3741659          0          1
    control5 |      2,521    .2427608    .4288367          0          1
    control6 |      2,519    .4096864    .4918735          0          1


* standardize all outcome variables for all observations at endline (Bihar and Jharkhand)

replace emo_1_norm = (emo1 - .0412044)/.1988021  if year==1
replace emo_2_norm = (emo2 - .0388273)/.1932214  if year==1
replace emo_3_norm = (emo3 - .0348653)/.1834749  if year==1

replace phy_1_norm = (phy1 - .086767)/.2815492  if year==1
replace phy_2_norm = (phy2 - .2044374 )/.4033698 if year==1
replace phy_3_norm = (phy3 - .0736926  )/.2613217  if year==1
replace phy_4_norm = (phy4 - .0526941)/.2234664 if year==1
replace phy_5_norm = (phy5 - .0110935 )/.1047606  if year==1
replace phy_6_norm = (phy6 - .0055468  )/.0742844   if year==1
replace phy_7_norm = (phy7 - .088748  )/.284436 if year==1

replace sexual_1_norm = (sexual1 - .0455626 )/.2085758  if year==1
replace sexual_2_norm = (sexual2 -  .0178288  )/ .1323553  if year==1 


replace control_1_norm = (control1 - .2664552 )/.4421927  if year==1
replace control_2_norm = (control2 - .0757035)/.2645756  if year==1 
replace control_3_norm = (control3 -  .3201111  )/.4666115 if year==1
replace control_4_norm = (control4 - .168254 )/.3741659   if year==1
replace control_5_norm = (control5 - .2427608 )/.4288367 if year==1 
replace control_6_norm = (control6 -  .4096864  )/.4918735   if year==1
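
* Optional cross-check of the typed constants above (a minimal sketch, not part of
* the original workflow). It recomputes each z-score from the stored results
* r(mean) and r(sd) of the Jharkhand (bihar == 0) sample in each year and reports
* the largest gap to the hand-coded version. The _chk variables and the tempvar
* are hypothetical helpers; tiny gaps are expected because the constants were
* typed with about seven digits.
foreach it in emo_1 emo_2 emo_3 phy_1 phy_2 phy_3 phy_4 phy_5 phy_6 phy_7 sexual_1 sexual_2 control_1 control_2 control_3 control_4 control_5 control_6 {
    * emo_1 -> emo1, control_6 -> control6, and so on
    local raw = subinstr("`it'", "_", "", 1)
    quietly summarize `raw' if bihar == 0 & year == 0
    quietly generate double `it'_chk = (`raw' - r(mean)) / r(sd) if year == 0
    quietly summarize `raw' if bihar == 0 & year == 1
    quietly replace `it'_chk = (`raw' - r(mean)) / r(sd) if year == 1
    tempvar gap
    quietly generate double `gap' = abs(`it'_chk - `it'_norm)
    quietly summarize `gap'
    display "`it': largest absolute gap = " r(max)
    drop `it'_chk `gap'
}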


gen emo_sum = emo_1_norm + emo_2_norm + emo_3_norm
gen phy_sum = phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm
gen sexual_sum = sexual_1_norm + sexual_2_norm 
gen control_sum = control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
gen ipv_sum = emo_1_norm + emo_2_norm + emo_3_norm + phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm + sexual_1_norm + sexual_2_norm +  control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
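
* Note on missing values (a sketch, not part of the original workflow): the "+"
* operator returns missing whenever any component is missing, so ipv_sum is
* defined only for respondents with all 18 items observed. This is why the Obs
* count for ipv_sum in the output below is smaller than for the single items.
* ipv_nmiss_chk is a hypothetical helper variable; the two counts should agree.
egen ipv_nmiss_chk = rowmiss(emo_1_norm emo_2_norm emo_3_norm phy_1_norm phy_2_norm phy_3_norm phy_4_norm phy_5_norm phy_6_norm phy_7_norm sexual_1_norm sexual_2_norm control_1_norm control_2_norm control_3_norm control_4_norm control_5_norm control_6_norm)
count if ipv_nmiss_chk > 0
count if missing(ipv_sum)
drop ipv_nmiss_chk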


* we standardize the outcome index again separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome index for control group (Jharkhand) at baseline 

sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 0 


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,566    .0003006    2.376432  -.8130109   11.41873
     phy_sum |      1,564   -.0003065    4.877216  -2.072206   30.39995
  sexual_sum |      1,567    4.04e-07    1.623313   -.413083   12.12977
 control_sum |      1,553   -.0122334    3.725419  -2.647244   14.55859
     ipv_sum |      1,549   -.0232251    9.708335  -5.945544   68.50704




* we standardize the index variables for all observations at baseline (Bihar and Jharkhand)

gen emo_index_norm     = (emo_sum -(.0003006))/2.376432  if year==0
gen phy_index_norm     = (phy_sum - (-.0003065))/4.877216  if year==0
gen sexual_index_norm  = (sexual_sum - (4.04e-07 ))/1.623313  if year==0
gen control_index_norm = (control_sum - (-.0122334 ))/3.725419  if year==0
gen ipv_index_norm     = (ipv_sum - (-.0232251))/ 9.708335  if year==0  


* for endline observations 
* mean and sd of outcome index for control group (Jharkhand) at endline 
sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 1


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      2,524   -6.71e-08    2.405822  -.5982382   15.05764
     phy_sum |      2,524   -4.89e-07    4.679791  -1.825381   39.03024
  sexual_sum |      2,524    3.49e-07    1.691154  -.3531503   11.99669
 control_sum |      2,512    .0011103    3.692974  -3.423421   11.79833
     ipv_sum |      2,512   -.0171749    8.790707  -6.200191   74.36716




* we standardize the index variables for all observations at endline (Bihar and Jharkhand)

replace emo_index_norm     = (emo_sum -(-6.71e-08))/2.405822  if year==1
replace phy_index_norm     = (phy_sum - (-4.89e-07))/4.679791  if year==1
replace sexual_index_norm  = (sexual_sum - ( 3.49e-07 ))/1.691154  if year==1
replace control_index_norm = (control_sum - (.0011103))/3.692974  if year==1
replace ipv_index_norm     = (ipv_sum - (-.0171749 ))/ 8.790707  if year==1   	 
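
* Sanity check (a sketch, not part of the original workflow): by construction
* each index should have mean 0 and standard deviation 1 in the Jharkhand
* (bihar == 0) sample of each year, up to rounding of the typed constants.
forvalues t = 0/1 {
    summarize emo_index_norm phy_index_norm sexual_index_norm control_index_norm ipv_index_norm if bihar == 0 & year == `t'
}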


* we have now standardized all outcome variables.



* generate all interaction and quadratic terms 

* we first create dummies for each categorical variable 

tabulate religion, generate(rel)
tabulate wealth_index, generate(wi)
tabulate hhd_gender, generate(hdgen)
tabulate caste, generate(cas)
tabulate place_residence, generate(pla)

* now we create interaction terms 
 
gen age_age = resp_age*resp_age
gen edu_age = resp_edu*resp_age
gen rel1_age =  (rel1)*resp_age
gen rel2_age =  (rel2)*resp_age
gen wi1_age = (wi1)*resp_age 
gen wi2_age = (wi2)*resp_age 
gen wi3_age = (wi3)*resp_age 
gen wi4_age = (wi4)*resp_age 
gen wi5_age = (wi5)*resp_age 
gen hdgen1_age = hdgen1*resp_age
gen hdgen2_age = hdgen2*resp_age
gen siz_age =  hh_size*resp_age 
gen hedu_age = husb_edu*resp_age
gen hhdage_age = hhd_age*resp_age
gen cas1_age =  cas1*resp_age
gen cas2_age =  cas2*resp_age
gen yea_age =  year_marr*resp_age
gen pla1_age =  pla1*resp_age
gen pla2_age =  pla2*resp_age

gen edu_edu =  resp_edu*resp_edu 
gen rel1_edu =  rel1*resp_edu
gen rel2_edu =  rel2*resp_edu
gen wi1_edu = wi1*resp_edu
gen wi2_edu = wi2*resp_edu
gen wi3_edu = wi3*resp_edu
gen wi4_edu = wi4*resp_edu
gen wi5_edu = wi5*resp_edu
gen hdgen1_edu = hdgen1*resp_edu
gen hdgen2_edu = hdgen2*resp_edu
gen edu_siz =  resp_edu*hh_size 
gen hedu_edu = husb_edu*resp_edu
gen hhdage_edu = hhd_age*resp_edu
gen cas1_edu = cas1*resp_edu
gen cas2_edu = cas2*resp_edu
gen edu_yea = resp_edu*year_marr 
gen edu_pla1 = resp_edu*pla1
gen edu_pla2 = resp_edu*pla2


gen rel1_wi1 =  rel1*wi1
gen rel1_wi2 =  rel1*wi2
gen rel1_wi3 =  rel1*wi3
gen rel1_wi4 =  rel1*wi4
gen rel1_wi5 =  rel1*wi5
gen rel2_wi1 =  rel2*wi1
gen rel2_wi2 =  rel2*wi2
gen rel2_wi3 =  rel2*wi3
gen rel2_wi4 =  rel2*wi4
gen rel2_wi5 =  rel2*wi5
gen hdgen1_rel1 =  hdgen1*rel1
gen hdgen1_rel2 =  hdgen1*rel2
gen hdgen2_rel1 =  hdgen2*rel1
gen hdgen2_rel2 =  hdgen2*rel2
gen rel1_siz =  rel1*hh_size
gen rel2_siz =  rel2*hh_size
gen hedu_rel1 = husb_edu*rel1
gen hedu_rel2 = husb_edu*rel2
gen hhdage_rel1 = hhd_age*rel1
gen hhdage_rel2 = hhd_age*rel2
gen cas1_rel1 =  cas1*rel1
gen cas1_rel2 =  cas1*rel2
gen cas2_rel1 =  cas2*rel1
gen cas2_rel2 =  cas2*rel2
gen rel1_yea =  rel1*year_marr
gen rel2_yea =  rel2*year_marr
gen rel1_pla1 =  rel1*pla1
gen rel1_pla2 =  rel1*pla2
gen rel2_pla1 =  rel2*pla1
gen rel2_pla2 =  rel2*pla2



gen hdgen1_wi1 = hdgen1*wi1
gen hdgen1_wi2 = hdgen1*wi2
gen hdgen1_wi3 = hdgen1*wi3
gen hdgen1_wi4 = hdgen1*wi4
gen hdgen1_wi5 = hdgen1*wi5
gen hdgen2_wi1 = hdgen2*wi1
gen hdgen2_wi2 = hdgen2*wi2
gen hdgen2_wi3 = hdgen2*wi3
gen hdgen2_wi4 = hdgen2*wi4
gen hdgen2_wi5 = hdgen2*wi5
gen wi1_siz = wi1*hh_size
gen wi2_siz = wi2*hh_size
gen wi3_siz = wi3*hh_size
gen wi4_siz = wi4*hh_size
gen wi5_siz = wi5*hh_size
gen hedu_wi1 = husb_edu*wi1
gen hedu_wi2 = husb_edu*wi2
gen hedu_wi3 = husb_edu*wi3
gen hedu_wi4 = husb_edu*wi4
gen hedu_wi5 = husb_edu*wi5
gen hhdage_wi1 = hhd_age*wi1
gen hhdage_wi2 = hhd_age*wi2
gen hhdage_wi3 = hhd_age*wi3
gen hhdage_wi4 = hhd_age*wi4
gen hhdage_wi5 = hhd_age*wi5
gen cas1_wi1 = cas1*wi1
gen cas1_wi2 = cas1*wi2
gen cas1_wi3 = cas1*wi3
gen cas1_wi4 = cas1*wi4
gen cas1_wi5 = cas1*wi5
gen cas2_wi1 = cas2*wi1
gen cas2_wi2 = cas2*wi2
gen cas2_wi3 = cas2*wi3                       
gen cas2_wi4 = cas2*wi4
gen cas2_wi5 = cas2*wi5
gen wi1_yea = wi1*year_marr 
gen wi2_yea = wi2*year_marr 
gen wi3_yea = wi3*year_marr 
gen wi4_yea = wi4*year_marr 
gen wi5_yea = wi5*year_marr 
gen wi1_pla1 = wi1*pla1
gen wi1_pla2 = wi1*pla2
gen wi2_pla1 = wi2*pla1
gen wi2_pla2 = wi2*pla2
gen wi3_pla1 = wi3*pla1
gen wi3_pla2 = wi3*pla2
gen wi4_pla1 = wi4*pla1
gen wi4_pla2 = wi4*pla2
gen wi5_pla1 = wi5*pla1
gen wi5_pla2 = wi5*pla2


gen hdgen1_siz =  hdgen1*hh_size
gen hdgen2_siz =  hdgen2*hh_size
gen hdgen1_hedu = hdgen1*husb_edu
gen hdgen2_hedu = hdgen2*husb_edu
gen hdgen1_hhdage= hdgen1*hhd_age
gen hdgen2_hhdage= hdgen2*hhd_age
gen cas1_hdgen1 =  cas1*hdgen1
gen cas1_hdgen2 =  cas1*hdgen2
gen cas2_hdgen1 =  cas2*hdgen1
gen cas2_hdgen2 =  cas2*hdgen2
gen hdgen1_yea = hdgen1*year_marr 
gen hdgen2_yea = hdgen2*year_marr 
gen hdgen1_pla1 =  hdgen1*pla1
gen hdgen1_pla2 =  hdgen1*pla2
gen hdgen2_pla1 =  hdgen2*pla1
gen hdgen2_pla2 =  hdgen2*pla2




gen siz_siz =  hh_size*hh_size 
gen hedu_siz = husb_edu*hh_size
gen hhdage_size = hhd_age*hh_size
gen cas1_siz =  cas1*hh_size 
gen cas2_siz =  cas2*hh_size 
gen siz_yea =  hh_size*year_marr 
gen siz_pla1 =  hh_size*pla1
gen siz_pla2 =  hh_size*pla2


gen hedu_hedu = husb_edu*husb_edu
gen hedu_hhdage = husb_edu*hhd_age
gen hedu_cas1 = husb_edu*cas1
gen hedu_cas2 = husb_edu*cas2
gen hedu_yea = husb_edu*year_marr
gen hedu_pla1 = husb_edu*pla1
gen hedu_pla2 = husb_edu*pla2


gen hhdage_hhdage = hhd_age*hhd_age
gen hhdage_cas1 = hhd_age*cas1 
gen hhdage_cas2 = hhd_age*cas2
gen hhdage_yea = hhd_age*year_marr
gen hhdage_pla1 = hhd_age*pla1
gen hhdage_pla2 = hhd_age*pla2


gen cas1_yea =  cas1*year_marr 
gen cas2_yea =  cas2*year_marr 
gen cas1_pla1 = cas1*pla1
gen cas1_pla2 = cas1*pla2
gen cas2_pla1 = cas2*pla1
gen cas2_pla2 = cas2*pla2


gen yea_yea =  year_marr*year_marr
gen pla1_yea =  pla1*year_marr
gen pla2_yea =  pla2*year_marr
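
* The blocks above all follow one pattern (one covariate times another), so a
* loop can reproduce any of them. A minimal sketch for the wealth-by-age terms,
* assuming wealth_index has five categories (wi1-wi5, as used above). The _alt
* variables are hypothetical; each is compared with the hand-written version
* and dropped immediately.
forvalues k = 1/5 {
    generate wi`k'_age_alt = wi`k' * resp_age
    assert wi`k'_age_alt == wi`k'_age
    drop wi`k'_age_alt
}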

* we run the matching code 
* we first create an indicator for pre-period Bihar observations. These observations are the treated group for matching purposes.
* we then match the remaining three groups (post Bihar, pre Jharkhand and post Jharkhand) to pre Bihar and create weights.

gen control_year = 1 if year == 0
replace control_year = 0 if year == 1
gen bihar_control = bihar * control_year

* we now match (here bihar_control takes a value of 1 for pre bihar observations)
* we omit the base-category dummies to avoid the dummy-variable trap

psmatch2 bihar_control resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   , kernel common 
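
* Quick diagnostics after matching (a sketch, not part of the original workflow).
* psmatch2 leaves behind variables such as _treated, _support and _weight, and
* pstest ships with the same package. The short covariate list passed to pstest
* is only illustrative; the full list used in psmatch2 could be supplied instead.
tabulate _treated _support
summarize _weight if _treated == 0, detail
pstest resp_age resp_edu year_marr husb_edu hhd_age hh_size, both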



* DID with controls (no district FE)
* WE DO NOT INCLUDE DISTRICT FIXED EFFECTS SINCE DISTRICT IDENTIFIERS ARE NOT AVAILABLE FOR THE DHS 05-06 DATASET
* WE CLUSTER THE STANDARD ERRORS AT THE CLUSTER (PSU) LEVEL
 

diff emo_index_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4  ) report cluster(cluster) robust

diff phy_index_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4  ) report cluster(cluster) robust

diff sexual_index_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4  ) report cluster(cluster) robust

diff control_index_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4  ) report cluster(cluster) robust

diff ipv_index_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4  ) report cluster(cluster) robust
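
* For reference, a sketch of the regression that underlies the diff calls above,
* shown for the emotional-violence index only (not part of the original
* workflow). The coefficient on 1.bihar#1.year is the difference-in-differences
* term; that it matches the diff output is an assumption worth verifying on the
* data. The local covars is a hypothetical name, and _weight comes from psmatch2.
local covars resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4
regress emo_index_norm i.bihar##i.year `covars' [aw=_weight], vce(cluster cluster)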


/*TABLE 4 : Impact of alcohol ban on intimate partner violence */ 

* OPEN THE MAIN FILE AND RUN THE FOLLOWING COMMANDS 

* without controls and FE

diff emo_index_norm [aw=_weight] , p(year)  t(bihar)   cluster(distyear) robust

diff phy_index_norm [aw=_weight] , p(year)  t(bihar)   cluster(distyear) robust

diff sexual_index_norm [aw=_weight] , p(year)  t(bihar)   cluster(distyear) robust

diff control_index_norm [aw=_weight] , p(year)  t(bihar)   cluster(distyear) robust

diff ipv_index_norm [aw=_weight] , p(year)  t(bihar)   cluster(distyear) robust
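
* The five calls above differ only in the outcome, so they can also be written
* as a loop (a sketch; it simply re-runs the same five commands).
foreach y in emo phy sexual control ipv {
    diff `y'_index_norm [aw=_weight], p(year) t(bihar) cluster(distyear) robust
}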


* with controls and FE
 

diff emo_index_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust



diff phy_index_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

diff sexual_index_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

diff control_index_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

diff ipv_index_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust
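
* A more compact way to write the five specifications with controls and district
* dummies (a sketch, not the original code; it re-runs the same models). The
* locals covars and distfe are hypothetical names. The district dummies are
* built explicitly because a wildcard such as dist* would also pick up
* distyear and district.
local covars resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4
local distfe
forvalues d = 1/62 {
    local distfe `distfe' dist`d'
}
foreach y in emo phy sexual control ipv {
    diff `y'_index_norm [aw=_weight], p(year) t(bihar) cov(`covars' `distfe') report cluster(distyear) robust
}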







/*TABLE 5 : Impact of alcohol ban on individual indicators of IPV */ 

* OPEN THE MAIN FILE AND RUN THE BELOW COMMANDS 

diff emo_1_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust

diff emo_2_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust

diff emo_3_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust

diff phy_1_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust

diff phy_2_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust

diff phy_3_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust

diff phy_4_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust

diff phy_5_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust

diff phy_6_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust

diff phy_7_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

diff sexual_1_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

diff sexual_2_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

diff sexual_3_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

diff control_1_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

diff control_2_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

diff control_3_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

diff control_4_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

diff control_5_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

diff control_6_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust



/* TABLE 6 : SUB SAMPLE ANALYSIS (RURAL VS URBAN) */





* From the practise file we first create two subsamples by place of residence, one rural and one urban (see variables v025 and place_residence).
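
* One way to build the two subsamples (a minimal sketch, not the original code).
* It assumes v025 keeps the standard DHS coding (1 = urban, 2 = rural); confirm with "tab v025 place_residence" before relying on it.
* The file names practise_rural.dta and practise_urban.dta are hypothetical.
preserve
keep if v025 == 2
save practise_rural.dta, replace
restore
preserve
keep if v025 == 1
save practise_urban.dta, replace
restore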

* Open the rural dataset.

* We standardize outcome variables separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome variables for control group (Jharkhand) at baseline 

summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==0 


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,941    .0453375    .2080968          0          1
        emo2 |      1,941    .0448223    .2069669          0          1
        emo3 |      1,941    .0365791    .1877744          0          1
        phy1 |      1,941    .1014941    .3020597          0          1
        phy2 |      1,941    .2297785    .4207987          0          1
-------------+---------------------------------------------------------
        phy3 |      1,941    .0829469    .2758731          0          1
        phy4 |      1,941    .0638846    .2446103          0          1
        phy5 |      1,941      .01288    .1127857          0          1
        phy6 |      1,941    .0056672    .0750864          0          1
        phy7 |      1,941    .1025245     .303415          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,941    .0463679    .2103347          0          1
     sexual2 |      1,941    .0164863    .1273692          0          1
     sexual3 |      1,941    .0303967    .1717205          0          1
    control1 |      1,939    .2851986    .4516254          0          1
    control2 |      1,940    .0835052    .2767156          0          1
-------------+---------------------------------------------------------
    control3 |      1,938    .3328173    .4713434          0          1
    control4 |      1,937    .1724316    .3778527          0          1
    control5 |      1,938    .2631579     .440461          0          1
    control6 |      1,936    .4194215    .4935919          0          1

. 



* standardize all outcome variables for all observations at baseline (bihar and Jharkhand)

gen emo_1_norm = (emo1 - .0453375)/.2080968  if year==0
gen emo_2_norm = (emo2 - .0448223)/.2069669  if year==0
gen emo_3_norm = (emo3 - .0365791 )/.1877744  if year==0

gen phy_1_norm = (phy1 - .1014941 )/.3020597 if year==0
gen phy_2_norm = (phy2 - .2297785  )/.4207987  if year==0
gen phy_3_norm = (phy3 - .0829469 )/.2758731   if year==0
gen phy_4_norm = (phy4 - .0638846)/.2446103  if year==0
gen phy_5_norm = (phy5 - .01288 )/.1127857  if year==0
gen phy_6_norm = (phy6 - .0056672  )/.0750864  if year==0
gen phy_7_norm = (phy7 - .1025245 )/.303415  if year==0

gen sexual_1_norm = (sexual1 - .0463679 )/.2103347 if year==0
gen sexual_2_norm = (sexual2 - .0164863 )/.1273692  if year==0 
gen sexual_3_norm = (sexual3 - .0303967 )/.1717205 if year==0  

gen control_1_norm = (control1 - .2851986  )/.4516254   if year==0
gen control_2_norm = (control2 - .0835052 )/.2767156 if year==0 
gen control_3_norm = (control3 - .3328173   )/.4713434 if year==0
gen control_4_norm = (control4 - .1724316  )/.3778527  if year==0
gen control_5_norm = (control5 - .2631579 )/ .440461 if year==0 
gen control_6_norm = (control6 -  .4194215)/ .4935919   if year==0


* for endline observations 
* mean and sd of outcome variables for control group (Jharkhand) at endline
summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==1


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,909    .0811943    .2732049          0          1
        emo2 |      1,909    .0743845    .2624643          0          1
        emo3 |      1,909    .0775275    .2674967          0          1
        phy1 |      1,909    .1435306     .350705          0          1
        phy2 |      1,909    .2587742    .4380761          0          1
-------------+---------------------------------------------------------
        phy3 |      1,909    .0921949    .2893766          0          1
        phy4 |      1,909    .0832897     .276392          0          1
        phy5 |      1,909    .0314301    .1745227          0          1
        phy6 |      1,909    .0298586    .1702416          0          1
        phy7 |      1,909    .1403876    .3474798          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,909    .0529073     .223907          0          1
     sexual2 |      1,909    .0429544    .2028075          0          1
     sexual3 |      1,909    .0513358    .2207395          0          1
    control1 |      1,909    .3520168    .4777243          0          1
    control2 |      1,909    .1220534    .3274333          0          1
-------------+---------------------------------------------------------
    control3 |      1,909    .2624411    .4400764          0          1
    control4 |      1,908    .1944444    .3958762          0          1
    control5 |      1,909    .3027763    .4595797          0          1
    control6 |      1,908    .3249476     .468478          0          1

. 


* standardize all outcome variables for all observations at endline (bihar and Jharkhand)

replace emo_1_norm = (emo1 - .0811943 )/.2732049  if year==1
replace emo_2_norm = (emo2 - .0743845 )/.2624643  if year==1
replace emo_3_norm = (emo3 - .0775275 )/.2674967  if year==1

replace phy_1_norm = (phy1 - .1435306 )/.350705  if year==1
replace phy_2_norm = (phy2 - .2587742   )/.4380761   if year==1
replace phy_3_norm = (phy3 - .0921949  )/.2893766  if year==1
replace phy_4_norm = (phy4 - .0832897 )/.276392  if year==1
replace phy_5_norm = (phy5 - .0314301)/.1745227   if year==1
replace phy_6_norm = (phy6 - .0298586    )/.1702416   if year==1
replace phy_7_norm = (phy7 - .1403876 )/ .3474798 if year==1

replace sexual_1_norm = (sexual1 - .0529073)/.223907  if year==1
replace sexual_2_norm = (sexual2 - .0429544  )/.2028075   if year==1 
replace sexual_3_norm = (sexual3 - .0513358 )/.2207395   if year==1 

replace control_1_norm = (control1 - .3520168  )/.4777243   if year==1
replace control_2_norm = (control2 - .1220534 )/.3274333  if year==1 
replace control_3_norm = (control3 - .2624411  )/.4400764   if year==1
replace control_4_norm = (control4 - .1944444 )/.3958762  if year==1
replace control_5_norm = (control5 - .3027763 )/.4595797 if year==1 
replace control_6_norm = (control6 -  .3249476 )/.468478  if year==1
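
* Optional cross-check of the hand-typed means and sds (a sketch, not part of the original analysis).
* It recomputes each z-score from r(mean) and r(sd) of the control group (bihar == 0) in each round and counts observations that differ from the *_norm variables above.
* The locals n_emo, n_phy, n_sexual, n_control and the tolerance 1e-3 are choices made here for illustration; run the block in one go so the locals stay defined.
local n_emo 3
local n_phy 7
local n_sexual 3
local n_control 6
foreach g in emo phy sexual control {
    forvalues i = 1/`n_`g'' {
        tempvar z
        quietly gen double `z' = .
        forvalues t = 0/1 {
            * control-group mean and sd of the raw item in round `t'
            quietly summarize `g'`i' if bihar == 0 & year == `t'
            quietly replace `z' = (`g'`i' - r(mean)) / r(sd) if year == `t'
        }
        quietly count if abs(`z' - `g'_`i'_norm) > 1e-3 & !missing(`z', `g'_`i'_norm)
        display "`g'_`i'_norm: " r(N) " observations differ from the recomputed z-score"
        drop `z'
    }
}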


gen emo_sum = emo_1_norm + emo_2_norm + emo_3_norm
gen phy_sum = phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm
gen sexual_sum = sexual_1_norm + sexual_2_norm + sexual_3_norm
gen control_sum = control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
gen ipv_sum = emo_1_norm + emo_2_norm + emo_3_norm + phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm + sexual_1_norm + sexual_2_norm + sexual_3_norm + control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
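
* Stata's + operator returns missing whenever any term is missing, so a respondent with even one missing item gets a missing sum.
* A quick way to see how many observations this affects in each round (a sketch; n_missing_items is a scratch variable introduced here and dropped again):
egen byte n_missing_items = rowmiss(emo_?_norm phy_?_norm sexual_?_norm control_?_norm)
tab n_missing_items year
drop n_missing_items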


* we standardize the outcome index again separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome index for control group (Jharkhand) at baseline 

sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 0 


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,941   -5.16e-07    2.401995  -.6292383   14.33345
     phy_sum |      1,941   -7.35e-07    4.618054  -1.971476   36.90873
  sexual_sum |      1,941    1.75e-07    2.389598  -.5268979   17.90204
 control_sum |      1,929    .0023105    3.685705   -3.54291   11.34958
     ipv_sum |      1,929    -.023957    9.007194  -6.670522   77.19798





* we standardize the index variables for all observations at baseline (bihar and Jharkhand)

gen emo_index_norm     = (emo_sum -( -5.16e-07))/2.401995  if year==0
gen phy_index_norm     = (phy_sum - (-7.35e-07))/4.618054  if year==0
gen sexual_index_norm  = (sexual_sum - ( 1.75e-07 ))/ 2.389598  if year==0
gen control_index_norm = (control_sum - (.0023105 ))/ 3.685705   if year==0
gen ipv_index_norm     = (ipv_sum - (-.023957 ))/ 9.007194   if year==0  



* for endline observations 
* mean and sd of outcome index for control group (Jharkhand) at endline 
sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 1


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,909    1.08e-07    2.550382  -.8704261   10.33824
     phy_sum |      1,909   -3.12e-07     5.13423  -2.379412   24.31023
  sexual_sum |      1,909   -4.56e-08    2.728052  -.6806531    13.2465
 control_sum |      1,907   -.0018797    3.912754  -3.549583   10.70658
     ipv_sum |      1,907   -.0004637    11.29996  -7.480074   58.60155




* we standardize the index variables for all observations at endline (bihar and Jharkhand)

replace emo_index_norm     = (emo_sum -(1.08e-07))/2.550382  if year==1
replace phy_index_norm     = (phy_sum - (-3.12e-07))/5.13423   if year==1
replace sexual_index_norm  = (sexual_sum - (-4.56e-08))/2.728052   if year==1
replace control_index_norm = (control_sum - ( -.0018797 ))/3.912754  if year==1
replace ipv_index_norm     = (ipv_sum - (-.0004637 ))/11.29996  if year==1   	 
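
* Optional check (a sketch): after this second standardization, each index should have a mean close to 0 and an sd close to 1 in the control group (bihar == 0) within each round.
tabstat emo_index_norm phy_index_norm sexual_index_norm control_index_norm ipv_index_norm if bihar == 0, by(year) statistics(mean sd n)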


* we have standardized all outcome variables .

* create quadratic and interaction terms (remember that interactions with pla2 are not formed in this subsample)

* we run the matching code 
* We first create an indicator for pre-period Bihar observations. These observations are the treated group for matching purposes.
* The remaining three groups (post-period Bihar, pre-period Jharkhand and post-period Jharkhand) are matched to pre-period Bihar to create the weights.
gen control_year = 1 if year == 0
replace control_year = 0 if year == 1
gen bihar_control = bihar * control_year 

* we remove place_residence (rural/urban) and its interaction term as control 

psmatch2 bihar_control resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender  hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   , kernel common 
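
* Optional diagnostics after matching (a sketch, not part of the original code).
* psmatch2 stores the common-support flag in _support and the matching weights in _weight; the commands below only tabulate and summarize them.
tab _support bihar_control
summarize _weight, detail
count if missing(_weight)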



diff emo_index_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

* repeat the same command for the remaining four index outcomes (phy_index_norm, sexual_index_norm, control_index_norm, ipv_index_norm)
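
* A possible way to run the remaining four index outcomes in one loop (a sketch, not the original code).
* It assumes the covariate list is the same as in the emo_index_norm call above and that the district dummies are named dist1 to dist62.
* The local macro names covs and dists are introduced here for illustration only; run the block in one go so the locals stay defined.
local covs resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4
local dists
forvalues d = 1/62 {
    * append dist1, dist2, ..., dist62 to the list
    local dists `dists' dist`d'
}
foreach y in phy_index_norm sexual_index_norm control_index_norm ipv_index_norm {
    diff `y' [aw=_weight], p(year) t(bihar) cov(`covs' `dists') report cluster(distyear) robust
}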



* open urban dataset
* We standardize outcome variables separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome variables for control group (Jharkhand) at baseline 

summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==0 


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |        583    .0274443    .1635143          0          1
        emo2 |        583    .0188679    .1361754          0          1
        emo3 |        583    .0291595     .168398          0          1
        phy1 |        583    .0377358    .1907203          0          1
        phy2 |        583    .1200686    .3253209          0          1
-------------+---------------------------------------------------------
        phy3 |        583    .0428816    .2027642          0          1
        phy4 |        583    .0154374    .1233904          0          1
        phy5 |        583    .0051458    .0716109          0          1
        phy6 |        583    .0051458    .0716109          0          1
        phy7 |        583    .0428816    .2027642          0          1
-------------+---------------------------------------------------------
     sexual1 |        583    .0428816    .2027642          0          1
     sexual2 |        583    .0222985    .1477792          0          1
     sexual3 |        583    .0240137    .1532231          0          1
    control1 |        583    .2041166    .4034007          0          1
    control2 |        583    .0497427    .2175996          0          1
-------------+---------------------------------------------------------
    control3 |        583    .2778731    .4483351          0          1
    control4 |        583    .1543739    .3616171          0          1
    control5 |        583    .1749571    .3802567          0          1
    control6 |        583    .3773585     .485142          0          1

. 
* standardize all outcome variables for all observations at baseline (bihar and Jharkhand)

gen emo_1_norm = (emo1 - .0274443)/.1635143   if year==0
gen emo_2_norm = (emo2 - .0188679)/.1361754  if year==0
gen emo_3_norm = (emo3 - .0291595  )/ .168398  if year==0

gen phy_1_norm = (phy1 - .0377358 )/.1907203 if year==0
gen phy_2_norm = (phy2 - .1200686  )/.3253209  if year==0
gen phy_3_norm = (phy3 - .0428816 )/.2027642  if year==0
gen phy_4_norm = (phy4 - .0154374 )/.1233904    if year==0
gen phy_5_norm = (phy5 - .0051458  )/.0716109   if year==0
gen phy_6_norm = (phy6 - .0051458  )/.0716109   if year==0
gen phy_7_norm = (phy7 - .0428816  )/.2027642   if year==0

gen sexual_1_norm = (sexual1 - .0428816 )/.2027642  if year==0
gen sexual_2_norm = (sexual2 - .0222985 )/.1477792   if year==0 
gen sexual_3_norm = (sexual3 - .0240137  )/.1532231  if year==0  

gen control_1_norm = (control1 - .2041166  )/.4034007    if year==0
gen control_2_norm = (control2 - .0497427  )/.2175996  if year==0 
gen control_3_norm = (control3 - .2778731  )/.4483351  if year==0
gen control_4_norm = (control4 - .1543739   )/.3616171  if year==0
gen control_5_norm = (control5 - .1749571 )/ .3802567 if year==0 
gen control_6_norm = (control6 -  .3773585 )/ .485142    if year==0


* for endline observations 
* mean and sd of outcome variables for control group (Jharkhand) at endline
summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==1

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |        430    .0348837     .183699          0          1
        emo2 |        430    .0302326    .1714261          0          1
        emo3 |        430    .0325581    .1776838          0          1
        phy1 |        430    .0744186    .2627566          0          1
        phy2 |        430    .2116279    .4089382          0          1
-------------+---------------------------------------------------------
        phy3 |        430    .0534884    .2252673          0          1
        phy4 |        430    .0325581    .1776838          0          1
        phy5 |        430    .0069767     .083332          0          1
        phy6 |        430    .0069767     .083332          0          1
        phy7 |        430    .0674419    .2510778          0          1
-------------+---------------------------------------------------------
     sexual1 |        430    .0232558    .1508905          0          1
     sexual2 |        430    .0139535    .1174345          0          1
     sexual3 |        430    .0209302    .1433176          0          1
    control1 |        429    .3333333    .4719549          0          1
    control2 |        430    .0697674    .2550514          0          1
-------------+---------------------------------------------------------
    control3 |        429    .2587413    .4384545          0          1
    control4 |        429    .1445221    .3520289          0          1
    control5 |        430    .2813953    .4502037          0          1
    control6 |        430    .2953488    .4567308          0          1

. 


* standardize all outcome variables for all observations at endline (bihar and Jharkhand)

replace emo_1_norm = (emo1 - .0348837 )/.183699   if year==1
replace emo_2_norm = (emo2 - .0302326  )/.1714261   if year==1
replace emo_3_norm = (emo3 - .0325581 )/.1776838    if year==1

replace phy_1_norm = (phy1 - .0744186 )/.2627566   if year==1
replace phy_2_norm = (phy2 - .2116279  )/.4089382   if year==1
replace phy_3_norm = (phy3 - .0534884  )/.2252673    if year==1
replace phy_4_norm = (phy4 - .0325581 )/ .1776838   if year==1
replace phy_5_norm = (phy5 -  .0069767)/.083332    if year==1
replace phy_6_norm = (phy6 - .0069767   )/.083332   if year==1
replace phy_7_norm = (phy7 -  .0674419  )/.2510778  if year==1

replace sexual_1_norm = (sexual1 - .0232558)/.1508905  if year==1
replace sexual_2_norm = (sexual2 - .0139535  )/.1174345    if year==1 
replace sexual_3_norm = (sexual3 - .0209302  )/.1433176    if year==1 

replace control_1_norm = (control1 - .3333333 )/.4719549   if year==1
replace control_2_norm = (control2 -  .0697674  )/.2550514   if year==1 
replace control_3_norm = (control3 -  .2587413 )/ .4384545    if year==1
replace control_4_norm = (control4 - .1445221 )/.3520289  if year==1
replace control_5_norm = (control5 - .2813953  )/.4502037  if year==1 
replace control_6_norm = (control6 -  .2953488 )/ .4567308   if year==1

gen emo_sum = emo_1_norm + emo_2_norm + emo_3_norm
gen phy_sum = phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm
gen sexual_sum = sexual_1_norm + sexual_2_norm + sexual_3_norm
gen control_sum = control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
gen ipv_sum = emo_1_norm + emo_2_norm + emo_3_norm + phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm + sexual_1_norm + sexual_2_norm + sexual_3_norm + control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm


* we standardize the outcome index again separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome index for control group (Jharkhand) at baseline 

sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 0 


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |        583    3.99e-09    2.433153  -.4795545    18.9179
     phy_sum |        583    6.71e-07    4.981138  -1.258733   52.95518
  sexual_sum |        583    7.77e-08     2.68163  -.5190995   17.70602
 control_sum |        583    1.37e-07    3.689194  -3.019208    13.7422
     ipv_sum |        583    8.93e-07    9.971999  -5.276595   98.46102




* we standardize the index variables for all observations at baseline (bihar and Jharkhand)

gen emo_index_norm     = (emo_sum -(  3.99e-09))/2.433153  if year==0
gen phy_index_norm     = (phy_sum - (6.71e-07))/4.981138  if year==0
gen sexual_index_norm  = (sexual_sum - (  7.77e-08))/  2.68163  if year==0
gen control_index_norm = (control_sum - (1.37e-07 ))/ 3.689194   if year==0
gen ipv_index_norm     = (ipv_sum - (8.93e-07 ))/ 9.971999   if year==0  

* for endline observations 
* mean and sd of outcome index for control group (Jharkhand) at endline 
sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 1


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |        430    9.26e-08    2.522893  -.5494915   16.35559
     phy_sum |        430    1.00e-06    5.000756  -1.657462   42.64406
  sexual_sum |        430    2.24e-07    2.420448  -.4189838   21.70123
 control_sum |        428   -.0196807    3.823232  -3.252185   12.31954
     ipv_sum |        428   -.0074087    9.930291  -5.878122   81.78876




* we standardize the index variables for all observations at endline (bihar and Jharkhand)

replace emo_index_norm     = (emo_sum -(9.26e-08))/ 2.522893  if year==1
replace phy_index_norm     = (phy_sum - (1.00e-06 ))/5.000756   if year==1
replace sexual_index_norm  = (sexual_sum - ( 2.24e-07))/2.420448   if year==1
replace control_index_norm = (control_sum - ( -.0196807  ))/3.823232  if year==1
replace ipv_index_norm     = (ipv_sum - (-.0074087  ))/9.930291  if year==1   	 


* we have standardized all outcome variables .

* create quadratic and interaction terms

* we run the matching code 
* We first create an indicator for pre-period Bihar observations. These observations are the treated group for matching purposes.
* The remaining three groups (post-period Bihar, pre-period Jharkhand and post-period Jharkhand) are matched to pre-period Bihar to create the weights.
gen control_year = 1 if year == 0
replace control_year = 0 if year == 1
gen bihar_control = bihar * control_year 

* we remove place_residence (rural/urban) as control 

psmatch2 bihar_control resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender  hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   , kernel common 



diff emo_index_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

* repeat the same command for the remaining four index outcomes (phy_index_norm, sexual_index_norm, control_index_norm, ipv_index_norm)



/* TABLE 7: Robustness Check: Estimates using pre-Covid data */



/* We create a copy of the practise file. */

/* First drop all observations interviewed after March 2020. This leaves 9 Jharkhand districts in the post round. Note that some interviews in these 9 districts also took place after March 2020; those observations are dropped as well. */

drop if v007==2021
drop if v006>3 & v007==2020



/* Then drop, from both rounds, the 15 Jharkhand districts whose second-round interviews took place after March 2020. */

drop if district==346 | district==347 | district==348 | district==352 | district==353 | district==356 |district==357 | district==358 | district==359 |district==364 | district==365 | district==366 |district==367 | district==368 | district==369

/* So we keep all pre-round observations from the 9 remaining Jharkhand districts, and in the post round only the observations from these districts that were interviewed before April 2020. */
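
* Optional checks on the pre-Covid sample (a sketch, not part of the original code).
* The first line confirms that no interview dated after March 2020 remains, reading v006 as interview month and v007 as interview year as in the drop commands above.
* The remaining lines count distinct Jharkhand districts per round; the comments above imply 9 in each round. dist_tag is a scratch variable introduced here and dropped again.
assert !(v007 == 2021 | (v007 == 2020 & v006 > 3))
egen byte dist_tag = tag(district year) if bihar == 0
tab year if dist_tag == 1
drop dist_tag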

* we standardize the outcome variables 

* We standardize outcome variables separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome variables for control group (Jharkhand) at baseline 
summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==0 


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,109    .0351668    .1842844          0          1
        emo2 |      1,109    .0333634    .1796646          0          1
        emo3 |      1,109    .0351668    .1842844          0          1
        phy1 |      1,109    .0856628    .2799917          0          1
        phy2 |      1,109    .2064923    .4049705          0          1
-------------+---------------------------------------------------------
        phy3 |      1,109    .0658251    .2480879          0          1
        phy4 |      1,109    .0351668    .1842844          0          1
        phy5 |      1,109    .0081154    .0897598          0          1
        phy6 |      1,109    .0072137    .0846648          0          1
        phy7 |      1,109    .0874662    .2826445          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,109    .0423805    .2015466          0          1
     sexual2 |      1,109    .0216411    .1455743          0          1
     sexual3 |      1,109    .0333634    .1796646          0          1
    control1 |      1,109    .2723174    .4453532          0          1
    control2 |      1,109    .0784491    .2689982          0          1
-------------+---------------------------------------------------------
    control3 |      1,108    .2951264    .4563055          0          1
    control4 |      1,108    .1624549    .3690341          0          1
    control5 |      1,108    .2229242    .4163958          0          1
    control6 |      1,109    .3769161    .4848323          0          1


* standardize all outcome variables for all observations at baseline (bihar and Jharkhand)

gen emo_1_norm = (emo1 - .0351668 )/.1842844  if year==0
gen emo_2_norm = (emo2 - .0333634)/.1796646   if year==0
gen emo_3_norm = (emo3 - .0351668 )/.1842844  if year==0

gen phy_1_norm = (phy1 - .0856628)/.2799917  if year==0
gen phy_2_norm = (phy2 - .2064923 )/.4049705   if year==0
gen phy_3_norm = (phy3 - .0658251  )/.2480879  if year==0
gen phy_4_norm = (phy4 - .0351668 )/.1842844  if year==0
gen phy_5_norm = (phy5 - .0081154)/.0897598  if year==0
gen phy_6_norm = (phy6 - .0072137  )/.0846648  if year==0
gen phy_7_norm = (phy7 - .0874662  )/.2826445   if year==0

gen sexual_1_norm = (sexual1 -  .0423805  )/.2015466  if year==0
gen sexual_2_norm = (sexual2 - .0216411  )/ .1455743 if year==0 
gen sexual_3_norm = (sexual3 - .0333634 )/ .1796646  if year==0  

gen control_1_norm = (control1 - .2723174 )/ .4453532  if year==0
gen control_2_norm = (control2 - .0784491 )/.2689982  if year==0 
gen control_3_norm = (control3 - .2951264  )/ .4563055  if year==0
gen control_4_norm = (control4 - .1624549 )/.3690341   if year==0
gen control_5_norm = (control5 - .2229242)/.4163958   if year==0 
gen control_6_norm = (control6 -   .3769161)/.4848323   if year==0


* for endline observations 
* mean and sd of outcome variables for control group (Jharkhand) at endline
summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==1


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |        709     .100141    .3003999          0          1
        emo2 |        709    .0846262    .2785212          0          1
        emo3 |        709    .0930889     .290762          0          1
        phy1 |        709    .1551481    .3623014          0          1
        phy2 |        709    .3060649    .4611823          0          1
-------------+---------------------------------------------------------
        phy3 |        709    .0987306    .2985106          0          1
        phy4 |        709     .077574     .267689          0          1
        phy5 |        709    .0239774    .1530868          0          1
        phy6 |        709    .0169252    .1290825          0          1
        phy7 |        709    .1622003    .3688947          0          1
-------------+---------------------------------------------------------
     sexual1 |        709    .0662906    .2489649          0          1
     sexual2 |        709    .0394922    .1949004          0          1
     sexual3 |        709    .0620592    .2414334          0          1
    control1 |        708    .4039548    .4910355          0          1
    control2 |        709    .1241185    .3299494          0          1
-------------+---------------------------------------------------------
    control3 |        709    .2849083    .4516894          0          1
    control4 |        708    .1864407    .3897372          0          1
    control5 |        709    .3794076    .4855822          0          1
    control6 |        709     .403385    .4909231          0          1


* standardize all outcome variables for all observations at endline (bihar and Jharkhand)

replace emo_1_norm = (emo1 - .100141 )/.3003999  if year==1
replace emo_2_norm = (emo2 - .0846262)/.2785212 if year==1
replace emo_3_norm = (emo3 - .0930889)/.290762   if year==1

replace phy_1_norm = (phy1 - .1551481 )/.3623014   if year==1
replace phy_2_norm = (phy2 - .3060649  )/.4611823    if year==1
replace phy_3_norm = (phy3 - .0987306 )/ .2985106  if year==1
replace phy_4_norm = (phy4 - .077574 )/.267689 if year==1
replace phy_5_norm = (phy5 - .0239774 )/.1530868   if year==1
replace phy_6_norm = (phy6 -  .0169252  )/.1290825  if year==1
replace phy_7_norm = (phy7 - .1622003)/.3688947 if year==1

replace sexual_1_norm = (sexual1 - .0662906 )/.2489649  if year==1
replace sexual_2_norm = (sexual2 -  .0394922  )/.1949004  if year==1 
replace sexual_3_norm = (sexual3 - .0620592)/.2414334  if year==1 

replace control_1_norm = (control1 - .4039548   )/.4910355  if year==1
replace control_2_norm = (control2 - .1241185 )/ .3299494   if year==1 
replace control_3_norm = (control3 - .2849083  )/.4516894 if year==1
replace control_4_norm = (control4 - .1864407 )/.3897372   if year==1
replace control_5_norm = (control5 - .3794076)/ .4855822  if year==1 
replace control_6_norm = (control6 -  .403385)/.4909231   if year==1
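
* Optional sketch (not part of the original workflow): the hand-typed means and
* sds above can be cross-checked by reading them from r(mean) and r(sd) after
* summarize. The z_ prefix is a hypothetical name, chosen so that the *_norm
* variables are left untouched. Small differences in the last digits are
* expected because the constants typed above are rounded to seven digits.
foreach v of varlist emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6 phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 {
    gen double z_`v' = .
    forvalues t = 0/1 {
        * control-group (Jharkhand) mean and sd for this round
        quietly summarize `v' if bihar == 0 & year == `t'
        * applied to all observations (Bihar and Jharkhand) of this round
        quietly replace z_`v' = (`v' - r(mean)) / r(sd) if year == `t'
    }
}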



gen emo_sum = emo_1_norm + emo_2_norm + emo_3_norm
gen phy_sum = phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm
gen sexual_sum = sexual_1_norm + sexual_2_norm + sexual_3_norm
gen control_sum = control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
gen ipv_sum = emo_1_norm + emo_2_norm + emo_3_norm + phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm + sexual_1_norm + sexual_2_norm + sexual_3_norm + control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
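
* Optional check (a sketch that uses only the variables created above): "+" returns
* missing if any component is missing, so an index is missing whenever one of its
* items is missing. This is why control_sum and ipv_sum have fewer observations
* than the other sums. n_miss_items is a hypothetical helper name.
egen n_miss_items = rowmiss(emo_?_norm phy_?_norm sexual_?_norm control_?_norm)
assert missing(ipv_sum) == (n_miss_items > 0)
drop n_miss_items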


* we standardize the outcome index again separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome index for control group (Jharkhand) at baseline 

sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 0 

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,109    9.41e-08    2.442591  -.5673562   15.85136
     phy_sum |      1,109    1.56e-07    4.607751  -1.757073   40.23114
  sexual_sum |      1,109    2.10e-07    2.594217  -.5446348   16.85227
 control_sum |      1,107    .0020536    3.729692   -3.30287   12.02546
     ipv_sum |      1,107    .0072374    9.461497  -6.171934   80.36716



* we standardize the index variables for all observations at baseline (bihar and Jharkhand)

gen emo_index_norm     = (emo_sum -( 9.41e-08))/2.442591  if year==0
gen phy_index_norm     = (phy_sum - (1.56e-07))/4.607751  if year==0
gen sexual_index_norm  = (sexual_sum - (2.10e-07))/ 2.594217  if year==0
gen control_index_norm = (control_sum - (.0020536))/3.729692   if year==0
gen ipv_index_norm     = (ipv_sum - (.0072374))/9.461497  if year==0   	 


* for endline observations 
* mean and sd of outcome index for control group (Jharkhand) at endline 
sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 1


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |        709    1.19e-07    2.568251  -.9573551   9.401171
     phy_sum |        709    6.18e-07      4.7594  -2.439856   26.56429
  sexual_sum |        709    1.66e-07    2.592671  -.7259372   12.56345
 control_sum |        707    -.009782    3.820178  -3.911003   10.03238
     ipv_sum |        707    .0018827    10.55533  -8.034151   58.56129



* we standardize the index variables for all observations at endline (bihar and Jharkhand)

replace emo_index_norm     = (emo_sum -(1.19e-07 ))/2.568251  if year==1
replace phy_index_norm     = (phy_sum - (6.18e-07))/4.7594  if year==1
replace sexual_index_norm  = (sexual_sum - (1.66e-07))/2.592671  if year==1
replace control_index_norm = (control_sum - (-.009782))/3.820178   if year==1
replace ipv_index_norm     = (ipv_sum - (.0018827 ))/10.55533  if year==1   	 
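
* Optional check (a sketch): after the two steps above, each index should have
* a mean close to 0 and an sd close to 1 in the Jharkhand sample of each round.
foreach v of varlist emo_index_norm phy_index_norm sexual_index_norm control_index_norm ipv_index_norm {
    forvalues t = 0/1 {
        quietly summarize `v' if bihar == 0 & year == `t'
        display "`v', year `t': mean = " %9.6f r(mean) "  sd = " %9.6f r(sd)
    }
}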


* we have standardized all outcome variables .




* generate all non-linear terms

* generate control_year (indicator for pre-round observations) and bihar_control (indicator for Bihar pre-round observations)


gen control_year = 1 if year == 0
replace control_year = 0 if year == 1
gen bihar_control = bihar * control_year



* we now match (here bihar_control takes the value 1 for Bihar pre-round observations)
* we leave out the base-category dummies to avoid the dummy-variable trap

psmatch2 bihar_control resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   , kernel common 
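
* Optional check (a sketch): psmatch2 stores the kernel weight in _weight (used in
* the diff commands that follow) and, with the common option, an on-support flag
* in _support. Tabulating them before estimation shows how many observations
* carry a weight.
tabulate bihar_control _support, missing
summarize _weight if bihar_control == 0, detail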

* now run the DID estimation 

diff emo_index_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust


diff phy_index_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

diff sexual_index_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

diff control_index_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

diff ipv_index_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust



/*  TABLE 9  UNINTERRUPTED INTERVIEWS */

* OPEN THE PRACTISE DATASET

* we first create the uninterrupted dataset, keeping only observations whose interviews were not interrupted by an adult of the household

gen interrupt = d122a+d122b+d122c
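
* Optional check (a sketch): the sum is missing whenever any of d122a, d122b or
* d122c is missing, and such rows are dropped by the keep that follows because
* a missing value is not equal to 0. The count shows how many rows that affects.
count if missing(interrupt)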

keep if interrupt==0



* We standardize outcome variables separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome variables for control group (Jharkhand) at baseline 



summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==0 


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      2,114     .037843    .1908615          0          1
        emo2 |      2,114    .0350047    .1838352          0          1
        emo3 |      2,114    .0307474    .1726734          0          1
        phy1 |      2,114    .0804163    .2720009          0          1
        phy2 |      2,114    .2005676    .4005198          0          1
-------------+---------------------------------------------------------
        phy3 |      2,114    .0709555    .2568113          0          1
        phy4 |      2,114    .0482498    .2143442          0          1
        phy5 |      2,114    .0070956    .0839556          0          1
        phy6 |      2,114    .0023652    .0485871          0          1
        phy7 |      2,114    .0823084    .2748991          0          1
-------------+---------------------------------------------------------
     sexual1 |      2,114    .0435194    .2040714          0          1
     sexual2 |      2,114    .0160833    .1258256          0          1
     sexual3 |      2,114     .026017    .1592235          0          1
    control1 |      2,112    .2736742    .4459493          0          1
    control2 |      2,113    .0681496     .252062          0          1
-------------+---------------------------------------------------------
    control3 |      2,112    .3181818    .4658808          0          1
    control4 |      2,111    .1672193      .37326          0          1
    control5 |      2,111    .2468025     .431253          0          1
    control6 |      2,110    .4109005    .4921139          0          1



* standardize all outcome variables for all observations at baseline (bihar and Jharkhand)

gen emo_1_norm = (emo1 - .037843)/.1908615   if year==0
gen emo_2_norm = (emo2 - .0350047)/.1838352  if year==0
gen emo_3_norm = (emo3 - .0307474 )/ .1726734   if year==0

gen phy_1_norm = (phy1 - .0804163 )/.2720009  if year==0
gen phy_2_norm = (phy2 - .2005676  )/.4005198   if year==0
gen phy_3_norm = (phy3 - .0709555  )/.2568113   if year==0
gen phy_4_norm = (phy4 - .0482498 )/.2143442    if year==0
gen phy_5_norm = (phy5 - .0070956  )/.0839556    if year==0
gen phy_6_norm = (phy6 - .0023652  )/.0485871 if year==0
gen phy_7_norm = (phy7 - .0823084 )/.2748991    if year==0

gen sexual_1_norm = (sexual1 - .0435194  )/.2040714   if year==0
gen sexual_2_norm = (sexual2 - .0160833  )/ .1258256  if year==0 
gen sexual_3_norm = (sexual3 -  .026017   )/.1592235  if year==0  

gen control_1_norm = (control1 - .2736742   )/.4459493    if year==0
gen control_2_norm = (control2 - .0681496  )/.252062  if year==0 
gen control_3_norm = (control3 - .3181818  )/.4658808   if year==0
gen control_4_norm = (control4 - .1672193  )/.37326   if year==0
gen control_5_norm = (control5 - .2468025  )/ .431253  if year==0 
gen control_6_norm = (control6 -  .4109005  )/ .4921139   if year==0

* for endline observations 
* mean and sd of outcome variables for control group (Jharkhand) at endline
summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==1


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,909    .0691461     .253769          0          1
        emo2 |      1,909    .0628601    .2427748          0          1
        emo3 |      1,909    .0686223     .252877          0          1
        phy1 |      1,909    .1257203    .3316207          0          1
        phy2 |      1,909    .2221058    .4157708          0          1
-------------+---------------------------------------------------------
        phy3 |      1,909    .0811943    .2732049          0          1
        phy4 |      1,909    .0749083    .2633123          0          1
        phy5 |      1,909    .0267156    .1612931          0          1
        phy6 |      1,909    .0240964    .1533886          0          1
        phy7 |      1,909    .1199581    .3249977          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,909    .0471451    .2120047          0          1
     sexual2 |      1,909    .0382399    .1918252          0          1
     sexual3 |      1,909    .0455736    .2086131          0          1
    control1 |      1,908    .3370021    .4728095          0          1
    control2 |      1,909    .1000524    .3001485          0          1
-------------+---------------------------------------------------------
    control3 |      1,908    .2536688    .4352243          0          1
    control4 |      1,907    .1814368    .3854808          0          1
    control5 |      1,909    .3064432    .4611368          0          1
    control6 |      1,909    .3174437    .4656036          0          1


* standardize all outcome variables for all observations at endline (bihar and Jharkhand)

replace emo_1_norm = (emo1 - .0691461  )/.253769    if year==1
replace emo_2_norm = (emo2 - .0628601  )/.2427748    if year==1
replace emo_3_norm = (emo3 - .0686223 )/.252877    if year==1

replace phy_1_norm = (phy1 - .1257203  )/.3316207    if year==1
replace phy_2_norm = (phy2 - .2221058   )/.4157708    if year==1
replace phy_3_norm = (phy3 - .0811943   )/.2732049     if year==1
replace phy_4_norm = (phy4 - .0749083  )/ .2633123   if year==1
replace phy_5_norm = (phy5 -  .0267156 )/.1612931   if year==1
replace phy_6_norm = (phy6 - .0240964  )/.1533886  if year==1
replace phy_7_norm = (phy7 -  .1199581   )/.3249977   if year==1

replace sexual_1_norm = (sexual1 - .0471451 )/.2120047    if year==1
replace sexual_2_norm = (sexual2 -  .0382399   )/.1918252     if year==1 
replace sexual_3_norm = (sexual3 - .0455736  )/.2086131   if year==1 

replace control_1_norm = (control1 - .3370021)/ .4728095    if year==1
replace control_2_norm = (control2 -  .1000524  )/.3001485   if year==1 
replace control_3_norm = (control3 -  .2536688  )/ .4352243    if year==1
replace control_4_norm = (control4 - .1814368  )/.3854808   if year==1
replace control_5_norm = (control5 - .3064432   )/.4611368   if year==1 
replace control_6_norm = (control6 -  .3174437 )/ .4656036    if year==1

gen emo_sum = emo_1_norm + emo_2_norm + emo_3_norm
gen phy_sum = phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm
gen sexual_sum = sexual_1_norm + sexual_2_norm + sexual_3_norm
gen control_sum = control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
gen ipv_sum = emo_1_norm + emo_2_norm + emo_3_norm + phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm + sexual_1_norm + sexual_2_norm + sexual_3_norm + control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm


* we standardize the outcome index again separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome index for control group (Jharkhand) at baseline 

sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 0 


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      2,114   -7.79e-08    2.404446  -.5667549   15.90358
     phy_sum |      2,114   -8.50e-07    4.719783  -1.730423   49.13245
  sexual_sum |      2,114   -2.03e-07    2.437355  -.5044771   18.62376
 control_sum |      2,104    .0013157    3.680527  -3.422285   11.96384
     ipv_sum |      2,104   -.0193072    8.987437  -6.223939   91.15833



* we standardize the index variables for all observations at baseline (bihar and Jharkhand)

gen emo_index_norm     = (emo_sum -(-7.79e-08 ))/2.404446  if year==0
gen phy_index_norm     = (phy_sum - (-8.50e-07))/4.719783 if year==0
gen sexual_index_norm  = (sexual_sum - ( -2.03e-07 ))/ 2.437355 if year==0
gen control_index_norm = (control_sum - (.0013157 ))/ 3.680527  if year==0
gen ipv_index_norm     = (ipv_sum - (-.0193072  ))/ 8.987437   if year==0  

* for endline observations 
* mean and sd of outcome index for control group (Jharkhand) at endline 
sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 1


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,909    3.94e-07    2.546731  -.8027664   11.21136
     phy_sum |      1,909   -2.19e-07    5.219468   -2.18682    26.4881
  sexual_sum |      1,909    8.21e-08    2.717428  -.6401852   14.08333
 control_sum |      1,906   -.0065898    3.841123  -3.445959   11.20887
     ipv_sum |      1,906   -.0008764    11.30406  -7.075731   62.99167



* we standardize the index variables for all observations at endline (bihar and Jharkhand)

replace emo_index_norm     = (emo_sum -(3.94e-07))/ 2.546731  if year==1
replace phy_index_norm     = (phy_sum - (-2.19e-07  ))/ 5.219468   if year==1
replace sexual_index_norm  = (sexual_sum - ( 8.21e-08))/2.717428    if year==1
replace control_index_norm = (control_sum - ( -.0065898   ))/ 3.841123  if year==1
replace ipv_index_norm     = (ipv_sum - (-.0008764 ))/ 11.30406  if year==1   	 


* create quadratic and interaction terms 

* create the bihar_control variable for matching 

* we have standardized all outcome variables.

psmatch2 bihar_control resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   , kernel common 

* DID TO FIND THE ESTIMATES 

diff emo_index_norm [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

* run the above command for the remaining four standardized outcome variables 
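
* A sketch of that step as a loop, assuming the same covariates and the district
* dummies dist1 to dist62 as in the command above. The locals covars and distdums
* are hypothetical helper names; locals must be defined in the same run of the
* do-file as the loop that uses them.
local covars resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4
local distdums
forvalues i = 1/62 {
    local distdums `distdums' dist`i'
}
foreach y of varlist phy_index_norm sexual_index_norm control_index_norm ipv_index_norm {
    diff `y' [aw=_weight], p(year) t(bihar) cov(`covars' `distdums') report cluster(distyear) robust
}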


/* TABLE 9:  test of potential mechanism (ALCOHOL CONSUMPTION AND ITS FREQUENCY) */

* open the main file

* create variables on whether the husband consumes alcohol (alco_cons) and on the frequency of his drinking (alco_f). (these variables have been created in the main file)
* alco_cons takes the value 1 when women surveyed in the domestic violence module of the NFHS reported that their husbands drink alcohol, and 0 otherwise
* alco_f takes the value 0 when the husband drinks either 'never' or 'sometimes', and 1 when he drinks 'often'
* (cases of 'never' are clubbed with 'sometimes')
* DID 

diff alco_cons [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust

diff alco_f [aw=_weight], p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust



/* TABLE 10. Robustness test taking difference in alcohol consumption and ipv measure */

* we create a copy of the practise file 

* this is a within state analysis 
* we keep only observations from Bihar 

* we first calculate the difference in mean of alcohol consumption in pre and post in each district of bihar


* create one baseline and one endline data 

* baseline bihar 

collapse d113, by(sdistri)

* save as baseline collapse

* endline bihar 

collapse d113 , by(sdist)

* saved as endline collapse

* merge the two datasets together

* baseline collapse

* change sdistri into sdist 
rename sdistri sdist
rename d113 d113pre

* save 

* endline collapse

rename d113 d113post

* save 

* now merge the datasets 

* open baseline collapse

merge 1:1 sdist using "C:\Users\Admin\Desktop\EDCC final 24 june\robustness\RObustness difference table 4\endline collapse.dta"

drop _merge

* generate alcohol consumption difference variable 


gen d113_diff = d113post - d113pre
 

* we create rank for alco difference

egen rankd113 = rank(-d113_diff)


* we have the ranking of alco diff in the collapsed dataset. 
* we assign the high value to districts where rankd113>19,
* since rank 1 shows the smallest decrease and rank 38 the largest decrease
* (a missing rank is left missing instead of being coded as high, because Stata treats missing as larger than any number)
gen med_high = (rankd113 > 19) if !missing(rankd113)

gen district = sdist


* we now have the district status as treated (high diff) or control (low diff) according to the difference in alco cons between the pre and post rounds

* we merge this file with the bihar observations (both rounds) from the practise dataset (merge by district variable)
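
* A sketch of this merge step. Both file names are hypothetical placeholders, and
* the Bihar file is assumed to contain the same district variable. The merge is
* m:1 because many respondents belong to each district.
keep district med_high d113_diff
save "district_status.dta", replace
use "bihar_both_rounds.dta", clear
merge m:1 district using "district_status.dta"
tabulate _merge
keep if _merge == 3
drop _merge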



* saved this file as alcocons_ipv.dta

* we now standardize the outcome variables
* We standardize outcome variables separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome variables for control group (med_high=0) at baseline 
summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if med_high == 0 & year==0



    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,884    .1157113    .3199633          0          1
        emo2 |      1,884    .0854565     .279634          0          1
        emo3 |      1,884    .0950106    .2933075          0          1
        phy1 |      1,884    .1778132    .3824569          0          1
        phy2 |      1,884    .3184713    .4660071          0          1
-------------+---------------------------------------------------------
        phy3 |      1,884    .1332272    .3399103          0          1
        phy4 |      1,884     .104034    .3053857          0          1
        phy5 |      1,884    .0313163    .1742175          0          1
        phy6 |      1,884    .0106157    .1025114          0          1
        phy7 |      1,884     .163482     .369903          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,884     .104034    .3053857          0          1
     sexual2 |      1,884    .0408705    .1980427          0          1
     sexual3 |      1,884    .0605096    .2384918          0          1
    control1 |      1,878     .456869    .4982689          0          1
    control2 |      1,880    .1585106    .3653163          0          1
-------------+---------------------------------------------------------
    control3 |      1,879    .3815859    .4859051          0          1
    control4 |      1,875    .2778667    .4480668          0          1
    control5 |      1,878    .4233227    .4942172          0          1
    control6 |      1,880    .4787234      .49968          0          1

* standardize all outcome variables for all observations at baseline (bihar)

gen emo_1_norm = (emo1 - .1157113 )/.3199633 if year==0
gen emo_2_norm = (emo2 - .0854565)/.279634  if year==0
gen emo_3_norm = (emo3 -  .0950106)/ .2933075  if year==0

gen phy_1_norm = (phy1 - .1778132)/.3824569  if year==0
gen phy_2_norm = (phy2 - .3184713  )/.4660071 if year==0
gen phy_3_norm = (phy3 - .1332272 )/.3399103  if year==0
gen phy_4_norm = (phy4 - .104034 )/ .3053857  if year==0
gen phy_5_norm = (phy5 - .0313163)/.1742175 if year==0
gen phy_6_norm = (phy6 - .0106157 )/.1025114  if year==0
gen phy_7_norm = (phy7 - .163482  )/.369903 if year==0

gen sexual_1_norm = (sexual1 - .104034)/.3053857 if year==0
gen sexual_2_norm = (sexual2 - .0408705 )/.1980427  if year==0 
gen sexual_3_norm = (sexual3 - .0605096 )/.2384918   if year==0  

gen control_1_norm = (control1 - .456869)/.4982689 if year==0
gen control_2_norm = (control2 - .1585106)/.3653163   if year==0 
gen control_3_norm = (control3 - .3815859   )/.4859051 if year==0
gen control_4_norm = (control4 - .2778667)/.4480668 if year==0
gen control_5_norm = (control5 - .4233227)/.4942172 if year==0 
gen control_6_norm = (control6 -  .4787234  )/.49968 if year==0

* for endline observations 
* mean and sd of outcome variables for control group (med_high=0) at endline
summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if med_high == 0 & year==1


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,824    .1244518    .3301867          0          1
        emo2 |      1,824    .0756579    .2645225          0          1
        emo3 |      1,824     .098136      .29758          0          1
        phy1 |      1,824    .1387061    .3457344          0          1
        phy2 |      1,824    .3371711    .4728735          0          1
-------------+---------------------------------------------------------
        phy3 |      1,824    .1134868     .317274          0          1
        phy4 |      1,824    .1047149    .3062697          0          1
        phy5 |      1,824    .0219298    .1464946          0          1
        phy6 |      1,824    .0098684    .0988757          0          1
        phy7 |      1,824    .1496711    .3568465          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,824    .0493421    .2166407          0          1
     sexual2 |      1,824    .0317982    .1755107          0          1
     sexual3 |      1,824    .0455044    .2084648          0          1
    control1 |      1,816    .4542952    .4980438          0          1
    control2 |      1,820    .1582418    .3650678          0          1
-------------+---------------------------------------------------------
    control3 |      1,818    .3278328    .4695527          0          1
    control4 |      1,816    .2549559    .4359565          0          1
    control5 |      1,815    .3195592    .4664343          0          1
    control6 |      1,819    .3991204    .4898522          0          1


* standardize all outcome variables for all observations at endline (bihar)

replace emo_1_norm = (emo1 - .1244518)/.3301867 if year==1
replace emo_2_norm = (emo2 - .0756579)/ .2645225 if year==1
replace emo_3_norm = (emo3 - .098136)/.29758   if year==1

replace phy_1_norm = (phy1 - .1387061)/.3457344   if year==1
replace phy_2_norm = (phy2 - .3371711  )/.4728735  if year==1
replace phy_3_norm = (phy3 - .1134868  )/ .317274  if year==1
replace phy_4_norm = (phy4 - .1047149 )/.3062697  if year==1
replace phy_5_norm = (phy5 - .0219298 )/.1464946  if year==1
replace phy_6_norm = (phy6 - .0098684 )/.0988757   if year==1
replace phy_7_norm = (phy7 - .1496711 )/.3568465  if year==1

replace sexual_1_norm = (sexual1 - .0493421  )/ .2166407    if year==1
replace sexual_2_norm = (sexual2 - .0317982 )/.1755107 if year==1 
replace sexual_3_norm = (sexual3 -  .0455044)/.2084648  if year==1 

replace control_1_norm = (control1 - .4542952  )/.4980438  if year==1
replace control_2_norm = (control2 -  .1582418)/.3650678 if year==1 
replace control_3_norm = (control3 - .3278328 )/ .4695527  if year==1
replace control_4_norm = (control4 - .2549559 )/ .4359565  if year==1
replace control_5_norm = (control5 - .3195592 )/.4664343 if year==1 
replace control_6_norm = (control6 -  .3991204)/.4898522  if year==1
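
* Sketch only (not part of the original run): the hard-coded means and SDs above
* can also be read from r(mean) and r(sd) after summarize, which avoids retyping.
* Assumptions: the baseline control group is med_high==0 & year==0, as in the
* index step below; the _z suffix is our own naming, so *_norm is left untouched.
* Results may differ from *_norm in the last digits because the typed values are rounded.
foreach v of varlist emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6 phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 {
    gen double `v'_z = .
    forvalues t = 0/1 {
        * control-group mean and sd for round `t'
        quietly summarize `v' if med_high == 0 & year == `t'
        local m = r(mean)
        local s = r(sd)
        * apply them to every observation in that round
        quietly replace `v'_z = (`v' - `m') / `s' if year == `t'
    }
}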


gen emo_sum = emo_1_norm + emo_2_norm + emo_3_norm
gen phy_sum = phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm
gen sexual_sum = sexual_1_norm + sexual_2_norm + sexual_3_norm
gen control_sum = control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
gen ipv_sum = emo_1_norm + emo_2_norm + emo_3_norm + phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm + sexual_1_norm + sexual_2_norm + sexual_3_norm + control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
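
* Sketch only (not part of the original run): "+" returns missing whenever any
* item is missing, which is why control_sum and ipv_sum have fewer observations
* than the other sums in the tables below. Two quick checks under that reading;
* the 1e-4 tolerance is our own choice to allow for float rounding.
assert abs(ipv_sum - (emo_sum + phy_sum + sexual_sum + control_sum)) < 1e-4 if !missing(ipv_sum)
* number of rows lost only because a control item is missing
count if missing(ipv_sum) & !missing(emo_sum, phy_sum, sexual_sum)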


* we standardize the outcome indexes again, separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome indexes for control group (med_high==0) at baseline 

sum emo_sum phy_sum sexual_sum control_sum ipv_sum if med_high == 0 & year == 0 


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,884   -1.88e-07    2.520982  -.9911689   9.119684
     phy_sum |      1,884    5.16e-08    4.882858   -2.60621   26.56923
  sexual_sum |      1,884   -3.70e-07    2.445185  -.8007542   11.71623
 control_sum |      1,860   -.0019893    3.816929  -4.570879   8.487931
     ipv_sum |      1,860    .0096543    10.18955  -8.969012   55.89307




* we standardize the index variables for all observations at baseline (bihar)

gen emo_index_norm     = (emo_sum -(-1.88e-07))/2.520982  if year==0
gen phy_index_norm     = (phy_sum - (5.16e-08 ))/4.882858 if year==0
gen sexual_index_norm  = (sexual_sum - (-3.70e-07 ))/2.445185  if year==0
gen control_index_norm = (control_sum - (-.0019893))/3.816929  if year==0
gen ipv_index_norm     = (ipv_sum - (.0096543))/10.18955   if year==0   	 


* for endline observations 
* mean and sd of outcome index for control group (med_high==0) at endline 

sum emo_sum phy_sum sexual_sum control_sum ipv_sum if med_high == 0 & year == 1


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,824   -2.74e-07    2.422075  -.9927105   9.176717
     phy_sum |      1,824    5.01e-07    4.674319  -2.482746   28.68355
  sexual_sum |      1,824    2.18e-07    2.369505  -.6272187   14.48335
 control_sum |      1,803    .0077049     3.87858  -4.128507   9.227416
     ipv_sum |      1,803    .0167115    9.659628  -8.231182   61.57103




* we standardize the index variables for all observations at endline (bihar)

replace emo_index_norm     = (emo_sum -(-2.74e-07))/2.422075  if year==1
replace phy_index_norm     = (phy_sum - ( 5.01e-07 ))/4.674319  if year==1
replace sexual_index_norm  = (sexual_sum - (2.18e-07))/2.369505  if year==1
replace control_index_norm = (control_sum - (.0077049 ))/3.87858  if year==1
replace ipv_index_norm     = (ipv_sum - (.0167115))/9.659628   if year==1   	 


* we have standardized all outcome variables.

* create quadratic and interaction terms 

* we first create an indicator for pre-period (year==0) observations with med_high==1. These observations are our treated group for matching purposes. 
* we then match the remaining three groups to them and create weights.


gen t1 = (year == 0 & med_high == 1)




* we now match

psmatch2 t1 resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   , kernel common 
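
* Sketch only (not part of the original run): psmatch2 leaves _pscore, _weight
* and _support in the data. A quick look at them before the DID step,
* assuming the psmatch2 call above has just been run:
tabulate t1 _support
summarize _pscore if t1 == 1 & _support == 1
summarize _pscore _weight if t1 == 0 & _support == 1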


 
*DID 


diff emo_index_norm [aw=_weight] , p(year)  t(med_high)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report  robust



* run the same DID command for the remaining 4 outcome indexes (phy_index_norm, sexual_index_norm, control_index_norm, ipv_index_norm)
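
* Sketch only (not part of the original run): one way to loop the DID over the
* remaining indexes. The macro names covs and distfe are our own; the variable
* lists and options are copied from the emo_index_norm command above, except
* that dist62 is listed once (a repeated covariate should simply be omitted
* by Stata as collinear). Locals only survive within one do-file run, so run
* this block as a whole.
local covs resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4
local distfe
forvalues i = 1/62 {
    local distfe `distfe' dist`i'
}
foreach y in phy_index_norm sexual_index_norm control_index_norm ipv_index_norm {
    diff `y' [aw=_weight], p(year) t(med_high) cov(`covs' `distfe') report robust
}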


/* APPENDIX */ 



/* FIGURE A1 : Association between alcohol consumption and domestic violence in Bihar */ 
* This column diagram compares drinkers and non-drinkers, for Bihar only, on each outcome variable 

* open practise file 
* create a Bihar-only file 
* note: the bar graphs do not look exactly like those in the paper because the y-axis scale differs. 

cibar emo_d  ,  over(d113) bargap(20)
 cibar phy_d  ,  over(d113) bargap(20)
 cibar sexual_d  ,  over(d113) bargap(20)
 cibar control_d  ,  over(d113) bargap(20)
 cibar ipv_d  ,  over(d113) bargap(20)
 
 
 
 /* Figure A2: Domestic Violence in Bihar and Jharkhand, in pre- and post-ban periods */ 
 

/* Column diagram for emotional violence */
/* create a copy of the practise file for each outcome variable, because collapse replaces the data with mean values only. */ 
/* open ev file. Remember that running this code collapses the file to a dataset of means */
/* rename v024 as state before running these codes */
replace v024=10 if v024==5
replace v024=20 if v024==15
rename v024 state
replace state=1 if state==10
replace state=3 if state==20
collapse (mean) meanemot= emo_d (sd) sdemot= emo_d (count) n=emo_d, by (state year)
gen hiemot=meanemot + invttail(n-1,0.025)*(sdemot/sqrt(n))
gen lowemot=meanemot - invttail(n-1,0.025)*(sdemot/sqrt(n))


twoway (bar meanemot state, base(0)) (rcap hiemot lowemot state), by(year)

/* Column diagram for physical violence */
/* open pv file. Remember that the file collapses to a mean data file in running these codes */
/* rename v024 as state before running these codes */
replace v024=10 if v024==5
replace v024=20 if v024==15
rename v024 state
replace state=1 if state==10
replace state=3 if state==20
collapse (mean) meanphy= phy_d (sd) sdphy= phy_d (count) n=phy_d, by (state year)
gen hiphy=meanphy + invttail(n-1,0.025)*(sdphy/sqrt(n))
gen lowphy=meanphy - invttail(n-1,0.025)*(sdphy/sqrt(n))
graph twoway (bar meanphy state, base(0)) (rcap hiphy lowphy state), by(year)


/* Column diagram for sexual violence */
/* open sv file. Remember that the file collapses to a mean data file in running these codes */
/* rename v024 as state before running these codes */
replace v024=10 if v024==5
replace v024=20 if v024==15
rename v024 state
replace state=1 if state==10
replace state=3 if state==20
collapse (mean) meansexual= sexual_d (sd) sdsexual= sexual_d (count) n=sexual_d, by (state year)
gen hisexual=meansexual + invttail(n-1,0.025)*(sdsexual/sqrt(n))
gen lowsexual=meansexual - invttail(n-1,0.025)*(sdsexual/sqrt(n))
graph twoway (bar meansexual state, base(0)) (rcap hisexual lowsexual state), by(year)


/* Column diagram for control behaviour */
/* open cb file. Remember that the file collapses to a mean data file in running these codes */
/* rename v024 as state before running these codes */
replace v024=10 if v024==5
replace v024=20 if v024==15
rename v024 state
replace state=1 if state==10
replace state=3 if state==20
collapse (mean) meancontrol= control_d (sd) sdcontrol= control_d (count) n=control_d, by (state year)
gen hicontrol=meancontrol + invttail(n-1,0.025)*(sdcontrol/sqrt(n))
gen lowcontrol=meancontrol - invttail(n-1,0.025)*(sdcontrol/sqrt(n))
graph twoway (bar meancontrol state, base(0)) (rcap hicontrol lowcontrol state), by(year)


/* Column diagram for overall ipv */
/* open ipv file. Remember that the file collapses to a mean data file in running these codes */
/* rename v024 as state before running these codes */
replace v024=10 if v024==5
replace v024=20 if v024==15
rename v024 state
replace state=1 if state==10
replace state=3 if state==20
collapse (mean) meanipv= ipv_d (sd) sdipv= ipv_d (count) n=ipv_d, by (state year)
gen hiipv=meanipv + invttail(n-1,0.025)*(sdipv/sqrt(n))
gen lowipv=meanipv - invttail(n-1,0.025)*(sdipv/sqrt(n))
graph twoway (bar meanipv state , base(0)) (rcap hiipv lowipv state), by(year)
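
* Sketch only (not part of the original run): preserve/restore lets all five
* column diagrams be drawn from one open file instead of one copy per outcome.
* Assumptions: the practise file is open with v024 still present and unrenamed;
* state_id, m, s, n, hi, low and the graph names g_* are our own names.
foreach y in emo_d phy_d sexual_d control_d ipv_d {
    preserve
    * same mapping as above: 5 and 10 become 1, 15 and 20 become 3
    recode v024 (5 10 = 1) (15 20 = 3), gen(state_id)
    collapse (mean) m = `y' (sd) s = `y' (count) n = `y', by(state_id year)
    * 95% confidence interval: mean +/- t(n-1, 0.025) * sd / sqrt(n)
    gen hi  = m + invttail(n - 1, 0.025) * (s / sqrt(n))
    gen low = m - invttail(n - 1, 0.025) * (s / sqrt(n))
    twoway (bar m state_id, base(0)) (rcap hi low state_id), by(year) name(g_`y', replace)
    restore
}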


/* FIGURE A3 : Density distributions of propensity scores    */ 

* REFER CODE LINES 2480-2487


/* TABLE A3 */

/*  PANEL A : Impact of alcohol ban on intimate partner violence (DUMMY OUTCOME VARIABLE) */ 
* OPEN THE main DATAFILE

diff emo_d [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust

diff phy_d [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust
 

diff sexual_d [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust

diff control_d [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust

diff ipv_d [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust


/*  PANEL B : Impact of alcohol ban on intimate partner violence (Standard DiD)*/ 
* OPEN THE main DATAFILE

diff emo_index_norm , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust

diff phy_index_norm  , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust

diff sexual_index_norm , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust 

diff control_index_norm  , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust

diff ipv_index_norm  , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust


/* Table A4: Likelihood of marriage in pre and post survey rounds */ 
/*Open the selection bias file */





/* Code for selection bias: impact of the alcohol ban on the likelihood of marriage. We do not use husb_edu, year_marr, or their nonlinear terms as controls, because including them leaves very few observations and this sample is large. */

* create a dependent variable (married in our dataset) that takes a value of 1 for all married women and 0 for all unmarried ones (this variable is present in the dataset)

/* Column 1 . OLS regression on all women age 18-49 who were selected and interviewed for the domestic violence module */
/* open the selection bias file and keep only v044=1 observations */ 
keep if v044==1 

/* save this file as  domes_violence */ 



reg married year resp_age  resp_edu    caste   poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age        cas2_rel2     rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62,  cluster(distyear) robust

/* Column 2 . OLS regression on all women age 18-49 sample */
/*Open the selection bias file */

reg married year resp_age  resp_edu    caste   poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age        cas2_rel2     rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62,  cluster(distyear) robust

/* Column 3 . DID regression on all women age 18-49 who were selected and interviewed for the domestic violence module */
/* open the domes_violence file */ 

diff married, p(year)  t(bihar)  cov( resp_age  resp_edu    caste   poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age        cas2_rel2     rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62) cluster(distyear) robust

/* Column 4 . DID regression on all women age 18-49 sample */
/*Open the selection bias file */

diff married, p(year)  t(bihar)  cov( resp_age  resp_edu    caste   poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age        cas2_rel2     rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62) cluster(distyear) robust



/* Table A5: Sample statistics of control variables */ 


/* Sample statistics of control variables. This table shows mean of control variables of Bihar (combined 15-16 and 19-21) and Jharkhand (combined 15-16 and 19-21)  */
/* open selection bias  file . We use all women interviewed in 18-49 age group in both rounds from bihar and jharkhand*/  

 
estpost ttest resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 , by(bihar)

esttab, wide nonumber mtitle("diff.")

/* use the commands below to calculate standard errors */

ttest resp_age,  by(bihar)
ttest resp_edu,  by(bihar)
ttest year_marr,  by(bihar)
ttest religion,  by(bihar)
ttest caste,  by(bihar)
ttest husb_edu,  by(bihar)
ttest hh_size,  by(bihar)
ttest hhd_age,  by(bihar)
ttest hhd_gender,  by(bihar)
ttest place_residence,  by(bihar)
ttest poorest,  by(bihar)
ttest poorer,  by(bihar)
ttest middle,  by(bihar)
ttest richer,  by(bihar)
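
* Sketch only (not part of the original run): the same 14 t-tests in a loop,
* printing the difference in means (bihar==0 minus bihar==1, as ttest reports
* it) and its standard error, which ttest stores in r(se).
foreach v of varlist resp_age resp_edu year_marr religion caste husb_edu hh_size hhd_age hhd_gender place_residence poorest poorer middle richer {
    quietly ttest `v', by(bihar)
    display "`v'" _col(20) %9.4f (r(mu_1) - r(mu_2)) _col(34) %9.4f r(se)
}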


/* Table A7. IPV outcome after excluding emotional and physical violence */

* open the main file 
* we also create an ipv index that excludes the emotional and physical violence indicators
gen ipv_t =   sexual_1_norm + sexual_2_norm + sexual_3_norm + control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm

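* we standardize the ipv_t outcome index again, separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome index for control group (Jharkhand) at baseline 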


sum ipv_t if bihar == 0 & year == 0 

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
       ipv_t |      2,512   -.0049449    4.635685  -3.949117   29.58829


* we standardize the index variables for all observations at baseline (bihar and Jharkhand)


gen ipv_t_norm     = (ipv_t - (-.0049449))/4.635685  if year==0   	 


* for endline observations 
* mean and sd of outcome index for control group (Jharkhand) at endline 
sum ipv_t if bihar == 0 & year == 1



    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
       ipv_t |      2,335   -.0038876    5.362361  -4.136858   25.04371


* we standardize the index variables for all observations at endline (bihar and Jharkhand)


replace ipv_t_norm     = (ipv_t - (-.0038876))/ 5.362361  if year==1   	



* we run a DID with weights as done in the main result estimation (ipv_t_norm in the dataset)

diff ipv_t_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 dist62 ) report cluster(distyear) robust


 /* Table A8: Robustness Check: Synthetic control method  */
 * this code shows how the scm_final.dta file (provided in the data inventory) was created; the SCM analysis can be run directly on that file 
 
 
/* SCM results using differences, as in the Abadie (2021) paper */

* row bind all observations from the 05 06, 15 16 and 19 21 rounds together 

* we use only the 05 06, 15 16 and 19 21 datasets
* we do not use sexual3 here since it was not collected in the 05 06 round 
* remember to combine telangana and andhra pradesh in the 2015-16 and 2019-21 rounds
* remember to combine ladakh and jammu kashmir in the 2019-21 round 

* here bihar is treated and all other states are our controls
* we create a state id (state2) for every state (4 for Bihar)
* we remove states that have wholly or partially prohibited alcohol
* these are gujarat, manipur, mizoram, nagaland and kerala 
* we also remove all union territories
* we normalize the generated indexes as done in our main result, 
* round by round.

* open scm_master file 


* mean and sd of outcome variables for control group (all states except bihar) in the 05 06 round 

summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2  control1 control2 control3 control4 control5 control6 if state2!=4 & year==2005 

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |     53,937    .0798524    .2710671          0          1
        emo2 |     53,968    .0321116    .1762982          0          1
        emo3 |     53,959    .0507608    .2195108          0          1
        phy1 |     53,972    .0717409    .2580608          0          1
        phy2 |     53,961    .1774986    .3820936          0          1
-------------+---------------------------------------------------------
        phy3 |     53,984     .055535    .2290236          0          1
        phy4 |     53,984    .0575541    .2329005          0          1
        phy5 |     53,985     .011318    .1057831          0          1
        phy6 |     53,990    .0066494    .0812729          0          1
        phy7 |     53,968     .081048    .2729114          0          1
-------------+---------------------------------------------------------
     sexual1 |     53,990    .0543064    .2266233          0          1
     sexual2 |     53,989     .028228    .1656251          0          1
    control1 |     53,742    .2131108    .4095091          0          1
    control2 |     53,895    .0648298    .2462275          0          1
    control3 |     53,888    .1351136     .341848          0          1
-------------+---------------------------------------------------------
    control4 |     53,894    .0803429    .2718258          0          1
    control5 |     53,805    .0994517    .2992704          0          1
    control6 |     53,804    .1537246    .3606879          0          1



* standardize all outcome variables for all observations in the 05 06 round (bihar and all other states) 

gen emo_1_norm = (emo1 - .0798524)/.2710671  if year==2005
gen emo_2_norm = (emo2 - .0321116 )/.1762982   if year==2005
gen emo_3_norm = (emo3 - .0507608 )/ .2195108   if year==2005

gen phy_1_norm = (phy1 - .0717409)/.2580608   if year==2005
gen phy_2_norm = (phy2 - .1774986)/.3820936  if year==2005
gen phy_3_norm = (phy3 - .055535  )/.2290236  if year==2005
gen phy_4_norm = (phy4 - .0575541  )/.2329005  if year==2005
gen phy_5_norm = (phy5 - .011318 )/.1057831  if year==2005
gen phy_6_norm = (phy6 - .0066494  )/.0812729   if year==2005
gen phy_7_norm = (phy7 -   .081048  )/.2729114   if year==2005

gen sexual_1_norm = (sexual1 - .0543064  )/.2266233  if year==2005
gen sexual_2_norm = (sexual2 - .028228  )/ .1656251    if year==2005


gen control_1_norm = (control1 - .2131108 )/.4095091   if year==2005
gen control_2_norm = (control2 -  .0648298   )/.2462275   if year==2005
gen control_3_norm = (control3 -    .1351136   )/.341848   if year==2005
gen control_4_norm = (control4 - .0803429  )/ .2718258  if year==2005
gen control_5_norm = (control5 - .0994517   )/.2992704   if year==2005
gen control_6_norm = (control6 -  .1537246  )/.3606879 if year==2005



* mean and sd of outcome variables for control group (all states except bihar) in the 15 16 round 

summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2  control1 control2 control3 control4 control5 control6 if state2!=4 & year==2015



    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |     50,171    .0656555    .2476813          0          1
        emo2 |     50,171    .0375516     .190111          0          1
        emo3 |     50,171    .0547727    .2275383          0          1
        phy1 |     50,171    .0862052    .2806696          0          1
        phy2 |     50,171    .1831337    .3867799          0          1
-------------+---------------------------------------------------------
        phy3 |     50,171    .0538359    .2256958          0          1
        phy4 |     50,171    .0550318    .2280446          0          1
        phy5 |     50,171    .0103446    .1011821          0          1
        phy6 |     50,171    .0054613    .0736994          0          1
        phy7 |     50,171    .0743258     .262303          0          1
-------------+---------------------------------------------------------
     sexual1 |     50,171    .0392856    .1942757          0          1
     sexual2 |     50,171    .0192143    .1372788          0          1
    control1 |     49,908    .2401619    .4271859          0          1
    control2 |     50,042    .0724591    .2592492          0          1
    control3 |     50,073    .1948156    .3960626          0          1
-------------+---------------------------------------------------------
    control4 |     50,062    .1414846    .3485242          0          1
    control5 |     50,046    .1735204    .3787003          0          1
    control6 |     50,010     .229794    .4207045          0          1

* standardize all outcome variables for all observations at 15 16 round (bihar and rest) 

replace emo_1_norm = (emo1 - .0656555 )/.2476813   if year==2015
replace emo_2_norm = (emo2 - .0375516 )/.190111   if year==2015
replace emo_3_norm = (emo3 - .0547727  )/ .2275383  if year==2015

replace phy_1_norm = (phy1 - .0862052 )/ .2806696   if year==2015
replace phy_2_norm = (phy2 - .1831337)/.3867799   if year==2015
replace phy_3_norm = (phy3 - .0538359  )/.2256958    if year==2015
replace phy_4_norm = (phy4 - .0550318 )/.2280446    if year==2015
replace phy_5_norm = (phy5 - .0103446 )/.1011821  if year==2015
replace phy_6_norm = (phy6 - .0054613 )/ .0736994  if year==2015
replace phy_7_norm = (phy7 -  .0743258   )/.262303   if year==2015

replace sexual_1_norm = (sexual1 - .0392856  )/.1942757  if year==2015
replace sexual_2_norm = (sexual2 -  .0192143  )/.1372788    if year==2015


replace control_1_norm = (control1 - .2401619   )/.4271859   if year==2015
replace control_2_norm = (control2 -  .0724591  )/.2592492   if year==2015
replace control_3_norm = (control3 -    .1948156    )/.3960626    if year==2015
replace control_4_norm = (control4 - .1414846 )/  .3485242    if year==2015
replace control_5_norm = (control5 -  .1735204 )/ .3787003     if year==2015
replace control_6_norm = (control6 -  .229794     )/.4207045   if year==2015



 * mean and sd of outcome variables for control group (rest leaving bihar) at 19 21 round 
 

summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2  control1 control2 control3 control4 control5 control6 if state2!=4 & year==2019



    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |     50,100    .0677046    .2512408          0          1
        emo2 |     50,100    .0451896    .2077219          0          1
        emo3 |     50,100    .0605788    .2385586          0          1
        phy1 |     50,100    .0917764     .288713          0          1
        phy2 |     50,100    .1841118    .3875792          0          1
-------------+---------------------------------------------------------
        phy3 |     50,100    .0536527    .2253333          0          1
        phy4 |     50,100    .0567665    .2313982          0          1
        phy5 |     50,100    .0149701    .1214341          0          1
        phy6 |     50,100    .0096806    .0979138          0          1
        phy7 |     50,100    .0753693    .2639889          0          1
-------------+---------------------------------------------------------
     sexual1 |     50,100    .0349102    .1835542          0          1
     sexual2 |     50,100    .0196607    .1388327          0          1
    control1 |     49,924    .2494792    .4327159          0          1
    control2 |     50,021    .0911417    .2878134          0          1
    control3 |     50,039    .1711265    .3766232          0          1
-------------+---------------------------------------------------------
    control4 |     50,032    .1350536    .3417842          0          1
    control5 |     50,029    .1848128    .3881495          0          1
    control6 |     50,012    .1950532    .3962456          0          1

* standardize all outcome variables for all observations at 19 21 round (bihar and rest) 

replace emo_1_norm = (emo1 - .0677046 )/.2512408    if year==2019
replace emo_2_norm = (emo2 -  .0451896  )/.2077219   if year==2019
replace emo_3_norm = (emo3 - .0605788  )/ .2385586   if year==2019

replace phy_1_norm = (phy1 - .0917764  )/  .288713    if year==2019
replace phy_2_norm = (phy2 - .1841118 )/.3875792    if year==2019
replace phy_3_norm = (phy3 - .0536527   )/ .2253333    if year==2019
replace phy_4_norm = (phy4 - .0567665  )/.2313982    if year==2019
replace phy_5_norm = (phy5 - .0149701 )/.1214341  if year==2019
replace phy_6_norm = (phy6 - .0096806  )/ .0979138   if year==2019
replace phy_7_norm = (phy7 -   .0753693    )/.2639889    if year==2019

replace sexual_1_norm = (sexual1 - .0349102  )/ .1835542  if year==2019
replace sexual_2_norm = (sexual2 -   .0196607  )/ .1388327      if year==2019


replace control_1_norm = (control1 -  .2494792  )/.4327159    if year==2019
replace control_2_norm = (control2 -  .0911417  )/.2878134   if year==2019
replace control_3_norm = (control3 -    .1711265   )/.3766232      if year==2019
replace control_4_norm = (control4 - .1350536  )/ .3417842    if year==2019
replace control_5_norm = (control5 -   .1848128   )/ .3881495      if year==2019
replace control_6_norm = (control6 -   .1950532   )/.3962456     if year==2019

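
* Optional sketch (not part of the original workflow): the per-round standardization
* above can also be computed directly from r(mean) and r(sd) after -summarize-,
* instead of pasting the constants by hand. The _chk suffix is a hypothetical name
* chosen here so that the *_norm variables created above are left untouched.
* Values may differ from the *_norm variables in the last digits, because the pasted
* constants are rounded to seven significant figures.
foreach v of varlist emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6 phy7 sexual1 sexual2 control1 control2 control3 control4 control5 control6 {
    gen double `v'_chk = .
    foreach y in 2005 2015 2019 {
        * control-group (all states except Bihar) moments for this round
        quietly summarize `v' if state2 != 4 & year == `y'
        local m = r(mean)
        local s = r(sd)
        replace `v'_chk = (`v' - `m') / `s' if year == `y'
    }
}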

gen emo_sum = emo_1_norm + emo_2_norm + emo_3_norm
gen phy_sum = phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm
gen sexual_sum = sexual_1_norm + sexual_2_norm 
gen control_sum = control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
gen ipv_sum = emo_1_norm + emo_2_norm + emo_3_norm + phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm + sexual_1_norm + sexual_2_norm  + control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm


* we standardize the outcome index again separately for each round 
* for 05 06 

* mean and sd of outcome index for control group (rest leaving bihar) at 05 06 

sum emo_sum phy_sum sexual_sum control_sum ipv_sum if state2!=4 & year == 2005



    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |     53,919   -.0001331    2.434527  -.7079741   13.20894
     phy_sum |     53,918   -.0040209    4.959939  -1.717931   38.85605
  sexual_sum |     53,986   -.0002484    1.797493   -.410066   10.04028
 control_sum |     53,376   -.0106228    3.802373  -2.233022   16.98825
     ipv_sum |     53,214   -.0510758    9.681754  -5.068993   79.09352

	 
	 * we standardize the index variables for all observations 

gen emo_index_norm     = (emo_sum -( -.0001331 ))/ 2.434527  if year==2005
gen phy_index_norm     = (phy_sum - (-.0040209  ))/4.959939   if year==2005
gen sexual_index_norm  = (sexual_sum - (-.0002484))/1.797493  if year==2005
gen control_index_norm = (control_sum - ( -.0106228  ))/ 3.802373   if year==2005
gen ipv_index_norm     = (ipv_sum - ( -.0510758))/9.681754  if year==2005  


* for 15 16 round 

* mean and sd of outcome index for control group (rest leaving bihar) at 15 16 

sum emo_sum phy_sum sexual_sum control_sum ipv_sum if state2!=4 & year == 2015



    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |     50,171   -4.30e-07     2.39233  -.7033238   12.98907
     phy_sum |     50,171    3.61e-07    4.704399  -1.720176   40.50822
  sexual_sum |     50,171    1.00e-07    1.714318  -.3421813   12.08959
 control_sum |     49,643   -.0034063     3.85453  -2.743938   13.86593
     ipv_sum |     49,643   -.0054434    9.491601  -5.509619    79.4528


replace emo_index_norm     = (emo_sum -( -4.30e-07 ))/2.39233  if year==2015
replace phy_index_norm     = (phy_sum - (3.61e-07  ))/4.704399  if year==2015
replace sexual_index_norm  = (sexual_sum - (1.00e-07  ))/ 1.714318  if year==2015
replace control_index_norm = (control_sum - ( -.0034063 ))/ 3.85453   if year==2015
replace ipv_index_norm     = (ipv_sum - ( -.0054434))/ 9.491601   if year==2015  


* for 19 21 round 

* mean and sd of outcome index for control group (rest leaving bihar) at 19 21 

sum emo_sum phy_sum sexual_sum control_sum ipv_sum if state2!=4 & year == 2019
	 

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |     50,100    2.40e-07    2.424561  -.7409663   12.24525
     phy_sum |     50,100   -1.05e-07    4.869253  -1.783982   35.25523
  sexual_sum |     50,100   -2.72e-07    1.730363  -.3318045   12.31909
 control_sum |     49,756   -.0031618    4.052987  -2.711117   13.75535
     ipv_sum |     49,756   -.0078765    10.11412   -5.56787   73.57493


replace emo_index_norm     = (emo_sum -( 2.40e-07  ))/ 2.424561  if year==2019
replace phy_index_norm     = (phy_sum - (-1.05e-07  ))/ 4.869253  if year==2019
replace sexual_index_norm  = (sexual_sum - ( -2.72e-07 ))/ 1.730363  if year==2019
replace control_index_norm = (control_sum - ( -.0031618))/ 4.052987   if year==2019
replace ipv_index_norm     = (ipv_sum - ( -.0078765 ))/ 10.11412   if year==2019 

* we have standardized all outcome variables .



* now we first find the state-wise average of the outcome variables for each round 

* first split the dataset into separate files, one per round






* now apply collapse to each round dataset and then append

use "C:\Users\Admin\Desktop\EDCC final 24 june\Synthetic control method\non collapse scm\master10506.dta", clear
collapse resp_age resp_edu year_marr caste husb_edu religion hhd_gender place_residence hhd_age hh_size phy_d phy1 phy2 phy3 phy4 phy5 phy6 phy7 phy_index control1 control2 control3 control4 control5 control6 control_index control_d emo1 emo2 emo3 emo_index emo_d sexual1 sexual2 sexual_index sexual_d poorest poorer middle richer richest sexual3 emo_1_norm emo_2_norm emo_3_norm phy_1_norm phy_2_norm phy_3_norm phy_4_norm phy_5_norm phy_6_norm phy_7_norm sexual_1_norm sexual_2_norm control_1_norm control_2_norm control_3_norm control_4_norm control_5_norm control_6_norm emo_sum phy_sum sexual_sum control_sum ipv_sum emo_index_norm phy_index_norm sexual_index_norm control_index_norm ipv_index_norm , by(state2)

gen year=2005

save "C:\Users\Admin\Desktop\EDCC final 24 june\Synthetic control method\non collapse scm\1.dta"

use "C:\Users\Admin\Desktop\EDCC final 24 june\Synthetic control method\non collapse scm\master11516.dta", clear
collapse resp_age resp_edu year_marr caste husb_edu religion hhd_gender place_residence hhd_age hh_size phy_d phy1 phy2 phy3 phy4 phy5 phy6 phy7 phy_index control1 control2 control3 control4 control5 control6 control_index control_d emo1 emo2 emo3 emo_index emo_d sexual1 sexual2 sexual_index sexual_d poorest poorer middle richer richest sexual3 emo_1_norm emo_2_norm emo_3_norm phy_1_norm phy_2_norm phy_3_norm phy_4_norm phy_5_norm phy_6_norm phy_7_norm sexual_1_norm sexual_2_norm control_1_norm control_2_norm control_3_norm control_4_norm control_5_norm control_6_norm emo_sum phy_sum sexual_sum control_sum ipv_sum emo_index_norm phy_index_norm sexual_index_norm control_index_norm ipv_index_norm , by(state2)

gen year=2015
save "C:\Users\Admin\Desktop\EDCC final 24 june\Synthetic control method\non collapse scm\2.dta"


use "C:\Users\Admin\Desktop\EDCC final 24 june\Synthetic control method\non collapse scm\master11921.dta", clear
collapse resp_age resp_edu year_marr caste husb_edu religion hhd_gender place_residence hhd_age hh_size phy_d phy1 phy2 phy3 phy4 phy5 phy6 phy7 phy_index control1 control2 control3 control4 control5 control6 control_index control_d emo1 emo2 emo3 emo_index emo_d sexual1 sexual2 sexual_index sexual_d poorest poorer middle richer richest sexual3 emo_1_norm emo_2_norm emo_3_norm phy_1_norm phy_2_norm phy_3_norm phy_4_norm phy_5_norm phy_6_norm phy_7_norm sexual_1_norm sexual_2_norm control_1_norm control_2_norm control_3_norm control_4_norm control_5_norm control_6_norm emo_sum phy_sum sexual_sum control_sum ipv_sum emo_index_norm phy_index_norm sexual_index_norm control_index_norm ipv_index_norm , by(state2)

gen year=2019
save "C:\Users\Admin\Desktop\EDCC final 24 june\Synthetic control method\non collapse scm\3.dta"


* append 1.dta, 2.dta and 3.dta to create a 123.dta(combined) file
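
* A minimal sketch of the append step described above. It assumes that 1.dta, 2.dta
* and 3.dta are in the current working directory; adjust the paths to wherever the
* files were saved. The -replace- option on -save- is an assumption added here so the
* sketch can be rerun.
use "1.dta", clear
append using "2.dta" "3.dta"
save "123.dta", replace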


* since we did not get results when using the normalized outcome variables directly, 
* we use the difference approach described in Abadie (2021) 
* we calculate the mean of each outcome variable over the pre-intervention years (here 2005 and 2015) 
* we then take the difference between the outcome variable and the mean calculated in the previous step 



* we merge the 2005 and 2015 datasets to calculate the average of the outcome variables over the pre-intervention years

* open 1.dta

rename  * *_2005

rename state2_2005 state2

* save 1.dta 


* open 2.dta 

rename  * *_2015
rename state2_2015 state2
 
* save 2.dta

* merge 1.dta and 2.dta
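
* A minimal sketch of this merge, assuming the renamed 1.dta and 2.dta were saved in
* the current working directory and each holds one row per state. state2 is the only
* variable the two files share after the renaming above.
use "1.dta", clear
merge 1:1 state2 using "2.dta"
drop _merge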



* we calculate the average of the outcome variables 

gen emo_mean = ( emo_index_norm_2005 + emo_index_norm_2015)/2
gen phy_mean = ( phy_index_norm_2005 + phy_index_norm_2015)/2
gen sexual_mean = ( sexual_index_norm_2005 + sexual_index_norm_2015)/2
gen control_mean = ( control_index_norm_2005 + control_index_norm_2015)/2
gen ipv_mean = ( ipv_index_norm_2005 + ipv_index_norm_2015)/2

* we then only keep the state number and the averages calculated above 

keep state2 emo_mean phy_mean sexual_mean control_mean ipv_mean

* save this above file as scm_avg.dta 

* now we merge the scm_avg.dta file with our 123 file. 
* this attaches the average of the outcome variables over the pre-intervention years (2005, 2015) to each state 
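
* A minimal sketch of this merge, assuming 123.dta and scm_avg.dta are in the current
* working directory. 123.dta has one row per state and round, while scm_avg.dta has
* one row per state, hence the many-to-one merge on state2.
use "123.dta", clear
merge m:1 state2 using "scm_avg.dta"
drop _merge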

* save as scm_final.dta (this file is present in the dataset)

* now we calculate the difference between each outcome variable and its average over the pre-intervention years 

gen emo_diff =  emo_index_norm - emo_mean
gen phy_diff =  phy_index_norm - phy_mean
gen sexual_diff =  sexual_index_norm - sexual_mean
gen control_diff =  control_index_norm - control_mean
gen ipv_diff =  ipv_index_norm - ipv_mean

* save the file 

* here we use only the neighbouring states of Bihar as our donor pool
* this includes Assam, Chhattisgarh, Jharkhand, Madhya Pradesh, Orissa, Uttar Pradesh, and West Bengal
* we therefore restrict the scm_final dataset to these eight states (Bihar + 7 controls)


keep if state2==3 | state2==4 |state2==5 |state2==12 |state2==15 |state2==21 |state2==27 |state2==29 

* save the file

* for replication of the scm results open scm_final.dta file and run the following commands 

* we tsset the panel and run the synth_runner command. Here we use all pre-intervention values of the covariate means and the pre-intervention (2015) value of the outcome variable as predictors


tsset  state2  year



 
synth_runner emo_diff resp_age(2005) resp_age(2015)  resp_edu(2005) resp_edu(2015)  year_marr(2005) year_marr(2015)   husb_edu(2005) husb_edu(2015) religion(2005) religion(2015) hhd_gender(2005) hhd_gender(2015) place_residence(2005) place_residence(2015) hhd_age(2005) hhd_age(2015) hh_size(2005) hh_size(2015)  poorest(2005) poorest(2015) poorer(2005) poorer(2015) middle(2005) middle(2015) richer(2005) richer(2015)   emo_diff(2015) , trunit(4) trperiod(2019)   gen_vars


* remember that a temp file will be created after each synth_runner command
* you need to reopen the analysis file and run the tsset command given above before running synth_runner for the next outcome variable 
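
* A sketch of how the remaining outcomes could be run in a loop, reopening the file
* and re-running tsset each time as noted above. It assumes scm_final.dta is in the
* current working directory; the local macro name preds is introduced here only for
* illustration. Variables added by gen_vars are discarded when the file is reopened,
* so save them inside the loop if they are needed later.
local preds resp_age(2005) resp_age(2015) resp_edu(2005) resp_edu(2015) year_marr(2005) year_marr(2015) husb_edu(2005) husb_edu(2015) religion(2005) religion(2015) hhd_gender(2005) hhd_gender(2015) place_residence(2005) place_residence(2015) hhd_age(2005) hhd_age(2015) hh_size(2005) hh_size(2015) poorest(2005) poorest(2015) poorer(2005) poorer(2015) middle(2005) middle(2015) richer(2005) richer(2015)
foreach y in phy sexual control ipv {
    use "scm_final.dta", clear
    tsset state2 year
    synth_runner `y'_diff `preds' `y'_diff(2015), trunit(4) trperiod(2019) gen_vars
}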
 
 
 * pre-trend check for the SCM: 2015 is used as a placebo intervention year, with only the 2005 values as predictors
 
 
 synth_runner emo_diff resp_age(2005) resp_edu(2005)   year_marr(2005)    husb_edu(2005)  religion(2005)  hhd_gender(2005)  place_residence(2005) hhd_age(2005)  hh_size(2005)   poorest(2005)  poorer(2005)  middle(2005)  richer(2005)   , trunit(4) trperiod(2015)
synth_runner phy_diff resp_age(2005) resp_edu(2005)   year_marr(2005)    husb_edu(2005)  religion(2005)  hhd_gender(2005)  place_residence(2005) hhd_age(2005)  hh_size(2005)   poorest(2005)  poorer(2005)  middle(2005)  richer(2005)   , trunit(4) trperiod(2015)
synth_runner sexual_diff resp_age(2005) resp_edu(2005)   year_marr(2005)    husb_edu(2005)  religion(2005)  hhd_gender(2005)  place_residence(2005) hhd_age(2005)  hh_size(2005)   poorest(2005)  poorer(2005)  middle(2005)  richer(2005)   , trunit(4) trperiod(2015)
synth_runner control_diff resp_age(2005) resp_edu(2005)   year_marr(2005)    husb_edu(2005)  religion(2005)  hhd_gender(2005)  place_residence(2005) hhd_age(2005)  hh_size(2005)   poorest(2005)  poorer(2005)  middle(2005)  richer(2005)   , trunit(4) trperiod(2015)
synth_runner ipv_diff resp_age(2005) resp_edu(2005)   year_marr(2005)    husb_edu(2005)  religion(2005)  hhd_gender(2005)  place_residence(2005) hhd_age(2005)  hh_size(2005)   poorest(2005)  poorer(2005)  middle(2005)  richer(2005)   , trunit(4) trperiod(2015)



/* Table A9: Impact of alcohol ban on intimate partner violence  */ 
* OPEN THE main FILE 

* without controls and FE

diff emo_index_norm [aw=_weight] , p(year)  t(bihar)   cluster(distyear) robust

diff phy_index_norm [aw=_weight] , p(year)  t(bihar)   cluster(distyear) robust

diff sexual_index_norm [aw=_weight] , p(year)  t(bihar)   cluster(distyear) robust

diff control_index_norm [aw=_weight] , p(year)  t(bihar)   cluster(distyear) robust

diff ipv_index_norm [aw=_weight] , p(year)  t(bihar)   cluster(distyear) robust
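
* Cross-check sketch (an assumption, not part of the original workflow): a two-period
* difference-in-differences estimate corresponds to the coefficient on the
* bihar x year interaction in an OLS regression. This assumes bihar and year are both
* 0/1 indicators in the main file and that _weight holds the matching weights.
regress emo_index_norm i.bihar##i.year [aw=_weight], vce(cluster distyear)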


* with controls and FE
 

diff emo_index_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 ) report cluster(distyear) robust



diff phy_index_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 ) report cluster(distyear) robust

diff sexual_index_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 ) report cluster(distyear) robust

diff control_index_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 ) report cluster(distyear) robust

diff ipv_index_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 ) report cluster(distyear) robust




/* Table A10: Sub Sample Analysis: Younger vs Older women */


* heterogeneity analysis: younger vs older women 
* we do a sub sample analysis on women aged 30 or below and women above 30 
* we create two subsamples, aged 30 or below and above 30 (from practise file)

* 30 or below subsample (v012<31 keeps women aged 30 or younger)


tab v012
keep if v012<31
tab v012



* normalize the outcome variables 

* We standardize outcome variables separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome variables for control group (Jharkhand) at baseline 
summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==0 


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,306    .0467075    .2110925          0          1
        emo2 |      1,306    .0352221    .1844112          0          1
        emo3 |      1,306    .0405819    .1973952          0          1
        phy1 |      1,306    .0788668     .269634          0          1
        phy2 |      1,306    .2090352    .4067754          0          1
-------------+---------------------------------------------------------
        phy3 |      1,306    .0658499    .2481146          0          1
        phy4 |      1,306    .0467075    .2110925          0          1
        phy5 |      1,306    .0099541    .0993103          0          1
        phy6 |      1,306    .0038285    .0617798          0          1
        phy7 |      1,306    .0834609    .2766836          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,306    .0528331    .2237858          0          1
     sexual2 |      1,306    .0199081    .1397381          0          1
     sexual3 |      1,306    .0290965    .1681414          0          1
    control1 |      1,305    .2850575    .4516149          0          1
    control2 |      1,306    .0849923    .2789771          0          1
-------------+---------------------------------------------------------
    control3 |      1,305    .3287356    .4699338          0          1
    control4 |      1,303    .1627015    .3692348          0          1
    control5 |      1,305    .2436782    .4294653          0          1
    control6 |      1,303    .4259401    .4946746          0          1




* standardize all outcome variables for all observations at baseline (bihar and Jharkhand)

gen emo_1_norm = (emo1 - .0467075   )/.2110925   if year==0
gen emo_2_norm = (emo2 -  .0352221 )/ .1844112 if year==0
gen emo_3_norm = (emo3 - .0405819    )/  .1973952     if year==0

gen phy_1_norm = (phy1 - .0788668  )/ .269634    if year==0
gen phy_2_norm = (phy2 - .2090352   )/ .4067754     if year==0
gen phy_3_norm = (phy3 - .0658499     )/ .2481146    if year==0
gen phy_4_norm = (phy4 -   .0467075  )/.2110925   if year==0
gen phy_5_norm = (phy5 - .0099541 )/ .0993103      if year==0
gen phy_6_norm = (phy6 -  .0038285   )/ .0617798    if year==0
gen phy_7_norm = (phy7 -  .0834609 )/ .2766836     if year==0

gen sexual_1_norm = (sexual1 -  .0528331   )/.2237858      if year==0
gen sexual_2_norm = (sexual2 -  .0199081    )/ .1397381     if year==0 
gen sexual_3_norm = (sexual3 - .0290965  )/.1681414     if year==0  

gen control_1_norm = (control1 - .2850575   )/ .4516149    if year==0
gen control_2_norm = (control2 -  .0849923  )/ .2789771   if year==0 
gen control_3_norm = (control3 -  .3287356   )/  .4699338     if year==0
gen control_4_norm = (control4 - .1627015  )/ .3692348   if year==0
gen control_5_norm = (control5 -  .2436782  )/ .4294653    if year==0 
gen control_6_norm = (control6 -  .4259401 )/ .4946746    if year==0


* for endline observations 
* mean and sd of outcome variables for control group (Jharkhand) at endline

summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==1

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,029    .0728863    .2600762          0          1
        emo2 |      1,029     .058309    .2344408          0          1
        emo3 |      1,029    .0592809    .2362644          0          1
        phy1 |      1,029    .1185617    .3234293          0          1
        phy2 |      1,029    .2254616    .4180892          0          1
-------------+---------------------------------------------------------
        phy3 |      1,029    .0670554    .2502395          0          1
        phy4 |      1,029    .0524781    .2230976          0          1
        phy5 |      1,029    .0194363    .1381199          0          1
        phy6 |      1,029      .02138    .1447178          0          1
        phy7 |      1,029    .1146744    .3187836          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,029    .0388727    .1933855          0          1
     sexual2 |      1,029    .0281827     .165575          0          1
     sexual3 |      1,029    .0408163    .1979607          0          1
    control1 |      1,028    .3608949    .4804939          0          1
    control2 |      1,029    .1146744    .3187836          0          1
-------------+---------------------------------------------------------
    control3 |      1,028    .2422179     .428634          0          1
    control4 |      1,027    .1742941     .379547          0          1
    control5 |      1,029    .2905734    .4542477          0          1
    control6 |      1,029    .3051506    .4606951          0          1




* standardize all outcome variables for all observations at endline (bihar and Jharkhand)

replace emo_1_norm = (emo1 - .0728863 )/.2600762    if year==1
replace emo_2_norm = (emo2 -   .058309   )/ .2344408    if year==1
replace emo_3_norm = (emo3 -    .0592809  )/ .2362644       if year==1

replace phy_1_norm = (phy1 -  .1185617  )/ .3234293     if year==1
replace phy_2_norm = (phy2 - .2254616 )/ .4180892      if year==1
replace phy_3_norm = (phy3 - .0670554   )/ .2502395    if year==1
replace phy_4_norm = (phy4 -  .0524781  )/.2230976     if year==1
replace phy_5_norm = (phy5 -  .0194363  )/ .1381199     if year==1
replace phy_6_norm = (phy6 -   .02138    )/.1447178     if year==1
replace phy_7_norm = (phy7 -  .1146744  )/ .3187836     if year==1

replace sexual_1_norm = (sexual1 - .0388727  )/ .1933855      if year==1
replace sexual_2_norm = (sexual2 -  .0281827   )/  .165575      if year==1 
replace sexual_3_norm = (sexual3 -  .0408163  )/  .1979607     if year==1 

replace control_1_norm = (control1 - .3608949    )/ .4804939       if year==1
replace control_2_norm = (control2 -   .1146744   )/ .3187836     if year==1 
replace control_3_norm = (control3 -    .2422179  )/.428634         if year==1
replace control_4_norm = (control4 -   .1742941    )/  .379547    if year==1
replace control_5_norm = (control5 - .2905734  )/ .4542477    if year==1 
replace control_6_norm = (control6 -   .3051506  )/ .4606951      if year==1

gen emo_sum = emo_1_norm + emo_2_norm + emo_3_norm
gen phy_sum = phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm
gen sexual_sum = sexual_1_norm + sexual_2_norm + sexual_3_norm
gen control_sum = control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
gen ipv_sum = emo_1_norm + emo_2_norm + emo_3_norm + phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm + sexual_1_norm + sexual_2_norm + sexual_3_norm + control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm


* we standardize the outcome index again separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome index for control group (Jharkhand) at baseline 

sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 0 

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,306   -9.42e-08    2.432628  -.6178502   14.60805
     phy_sum |      1,306   -4.82e-07    4.490254  -1.756896   43.04805
  sexual_sum |      1,306   -9.79e-08     2.47234  -.5516028   17.02058
 control_sum |      1,299   -.0041127    3.680798  -3.504484   11.48059
     ipv_sum |      1,299   -.0136772    8.961273  -6.430833   81.70084




* we standardize the index variables for all observations at baseline (bihar and Jharkhand)

gen emo_index_norm     = (emo_sum -( -9.42e-08 ))/ 2.432628 if year==0
gen phy_index_norm     = (phy_sum - (  -4.82e-07   ))/  4.490254  if year==0
gen sexual_index_norm  = (sexual_sum - (  -9.79e-08   ))/2.47234   if year==0
gen control_index_norm = (control_sum - (-.0041127 ))/ 3.680798  if year==0
gen ipv_index_norm     = (ipv_sum - (   -.0136772  ))/ 8.961273     if year==0  




* for endline observations 
* mean and sd of outcome index for control group (Jharkhand) at endline 
sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 1


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,029   -3.59e-08    2.505023  -.7798741   11.56317
     phy_sum |      1,029    5.31e-07     4.95437  -2.057214   29.19201
  sexual_sum |      1,029    8.71e-08    2.690069  -.5774064   15.68468
 control_sum |      1,026   -.0121841    3.945064  -3.437176   11.12073
     ipv_sum |      1,026   -.0021997    10.85949  -6.851671   67.56059





* we standardize the index variables for all observations at endline (Bihar and Jharkhand)

replace emo_index_norm     = (emo_sum -(-3.59e-08   ))/  2.505023  if year==1
replace phy_index_norm     = (phy_sum - ( 5.31e-07  ))/ 4.95437  if year==1
replace sexual_index_norm  = (sexual_sum - (8.71e-08   ))/ 2.690069  if year==1
replace control_index_norm = (control_sum - (-.0121841  ))/ 3.945064    if year==1
replace ipv_index_norm     = (ipv_sum - ( -.0021997  ))/  10.85949    if year==1   	 
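
* Optional cross-check (a sketch, not part of the original analysis). The control-group
* means and standard deviations typed in by hand above can also be read from r(mean) and
* r(sd) after summarize. The _chk and _diff variable names are hypothetical and are
* dropped again inside the loop.
foreach s in emo phy sexual control ipv {
    gen double `s'_index_chk = .
    forvalues y = 0/1 {
        quietly summarize `s'_sum if bihar == 0 & year == `y'
        replace `s'_index_chk = (`s'_sum - r(mean)) / r(sd) if year == `y'
    }
    * any difference should come only from rounding in the hand-typed values
    gen double `s'_index_diff = abs(`s'_index_chk - `s'_index_norm)
    summarize `s'_index_diff
    drop `s'_index_chk `s'_index_diff
}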


* we have standardized all outcome variables.

* create all non-linear terms 

* we first create an indicator for pre-period Bihar observations. These observations are the treated group for the matching step. 
* the other three groups (post-period Bihar, pre-period Jharkhand and post-period Jharkhand) are matched to pre-period Bihar to create the weights.


gen control_year = 1 if year == 0
replace control_year = 0 if year == 1
gen bihar_control = bihar * control_year
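
* Quick check (a sketch; it assumes bihar and year are both coded 0/1 with no other values).
* bihar_control should equal 1 only for Bihar observations at baseline; the assert stops
* the do-file if that is not the case.
tab bihar year, missing
assert bihar_control == (bihar == 1 & year == 0) if !missing(bihar, year)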

* we now match (bihar_control equals 1 for pre-period Bihar observations)
* we omit the base-category dummies to avoid the dummy-variable trap 

psmatch2 bihar_control resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   , kernel common 
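
* Optional balance check (a sketch, not part of the original analysis). pstest is installed
* together with psmatch2 and compares covariate means before and after matching. The
* covariates listed here are only an illustrative subset of those in the psmatch2 call.
pstest resp_age resp_edu year_marr husb_edu hhd_age hh_size, both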


diff emo_index_norm [aw=_weight] , p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust
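
* A more compact way to write the call above (a sketch; the local macro names xvars and
* distvars are hypothetical). It assumes the same specification is wanted for all five
* standardized indices. Locals must be defined in the same run of the do-file as the
* commands that use them.
local xvars resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4
* build the list dist1 ... dist62 without typing each name
local distvars
forvalues i = 1/62 {
    local distvars `distvars' dist`i'
}
foreach y in emo_index_norm phy_index_norm sexual_index_norm control_index_norm ipv_index_norm {
    diff `y' [aw=_weight], p(year) t(bihar) cov(`xvars' `distvars') report cluster(distyear) robust
}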



* subsample: respondents above 30 (v012 > 30) 


* open practise file 


keep if v012>30
tab v012

* save as above 30 
* normalize the outcome variables 

* We standardize outcome variables separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome variables for control group (Jharkhand) at baseline 
summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==0 

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,218    .0353038    .1846223          0          1
        emo2 |      1,218    .0426929     .202247          0          1
        emo3 |      1,218    .0287356    .1671312          0          1
        phy1 |      1,218    .0952381    .2936641          0          1
        phy2 |      1,218    .1995074    .3997942          0          1
-------------+---------------------------------------------------------
        phy3 |      1,218    .0821018    .2746325          0          1
        phy4 |      1,218    .0591133    .2359335          0          1
        phy5 |      1,218    .0123153     .110334          0          1
        phy6 |      1,218    .0073892    .0856772          0          1
        phy7 |      1,218    .0944171    .2925282          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,218    .0377668    .1907101          0          1
     sexual2 |      1,218    .0155993    .1239702          0          1
     sexual3 |      1,218    .0287356    .1671312          0          1
    control1 |      1,217    .2465078    .4311548          0          1
    control2 |      1,217    .0657354    .2479209          0          1
-------------+---------------------------------------------------------
    control3 |      1,216    .3108553    .4630341          0          1
    control4 |      1,217    .1741988    .3794363          0          1
    control5 |      1,216    .2417763    .4283356          0          1
    control6 |      1,216    .3922697    .4884572          0          1



* standardize all outcome variables for all observations at baseline (Bihar and Jharkhand)

gen emo_1_norm = (emo1 - .0353038  )/ .1846223   if year==0
gen emo_2_norm = (emo2 -  .0426929 )/ .202247   if year==0
gen emo_3_norm = (emo3 - .0287356      )/ .1671312      if year==0

gen phy_1_norm = (phy1 - .0952381 )/  .2936641     if year==0
gen phy_2_norm = (phy2 - .1995074   )/ .3997942      if year==0
gen phy_3_norm = (phy3 - .0821018   )/  .2746325     if year==0
gen phy_4_norm = (phy4 -   .0591133   )/.2359335    if year==0
gen phy_5_norm = (phy5 - .0123153 )/ .110334       if year==0
gen phy_6_norm = (phy6 -  .0073892  )/ .0856772    if year==0
gen phy_7_norm = (phy7 -  .0944171  )/ .2925282     if year==0

gen sexual_1_norm = (sexual1 -  .0377668    )/.1907101       if year==0
gen sexual_2_norm = (sexual2 -  .0155993   )/ .1239702     if year==0 
gen sexual_3_norm = (sexual3 - .0287356  )/.1671312      if year==0  

gen control_1_norm = (control1 - .2465078   )/ .4311548     if year==0
gen control_2_norm = (control2 -  .0657354 )/ .2479209    if year==0 
gen control_3_norm = (control3 -   .3108553   )/ .4630341     if year==0
gen control_4_norm = (control4 - .1741988  )/ .3794363     if year==0
gen control_5_norm = (control5 -  .2417763  )/ .4283356    if year==0 
gen control_6_norm = (control6 -   .3922697  )/ .4884572      if year==0




* for endline observations 
* mean and sd of outcome variables for control group (Jharkhand) at endline

summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==1


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,310    .0725191    .2594445          0          1
        emo2 |      1,310    .0725191    .2594445          0          1
        emo3 |      1,310    .0770992    .2668507          0          1
        phy1 |      1,310     .140458    .3475943          0          1
        phy2 |      1,310    .2694656    .4438517          0          1
-------------+---------------------------------------------------------
        phy3 |      1,310    .0992366    .2990937          0          1
        phy4 |      1,310    .0908397    .2874908          0          1
        phy5 |      1,310    .0328244    .1782449          0          1
        phy6 |      1,310    .0290076     .167892          0          1
        phy7 |      1,310    .1366412    .3435994          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,310    .0541985    .2264954          0          1
     sexual2 |      1,310    .0450382    .2074671          0          1
     sexual3 |      1,310    .0496183     .217238          0          1
    control1 |      1,310    .3389313    .4735272          0          1
    control2 |      1,310     .110687     .313864          0          1
-------------+---------------------------------------------------------
    control3 |      1,310    .2770992    .4477368          0          1
    control4 |      1,310    .1938931    .3954971          0          1
    control5 |      1,310    .3053435    .4607287          0          1
    control6 |      1,309    .3307869    .4706763          0          1




* standardize all outcome variables for all observations at endline (Bihar and Jharkhand)

replace emo_1_norm = (emo1 - .0725191  )/.2594445    if year==1
replace emo_2_norm = (emo2 -  .0725191   )/  .2594445     if year==1
replace emo_3_norm = (emo3 -    .0770992  )/ .2668507        if year==1

replace phy_1_norm = (phy1 -  .140458  )/ .3475943     if year==1
replace phy_2_norm = (phy2 - .2694656 )/ .4438517     if year==1
replace phy_3_norm = (phy3 - .0992366   )/ .2990937      if year==1
replace phy_4_norm = (phy4 -  .0908397   )/.2874908       if year==1
replace phy_5_norm = (phy5 -  .0328244  )/ .1782449       if year==1
replace phy_6_norm = (phy6 -   .0290076    )/.167892      if year==1
replace phy_7_norm = (phy7 -  .1366412 )/ .3435994     if year==1

replace sexual_1_norm = (sexual1 - .0541985  )/ .2264954       if year==1
replace sexual_2_norm = (sexual2 -  .0450382  )/  .2074671      if year==1 
replace sexual_3_norm = (sexual3 -  .0496183    )/ .217238       if year==1 

replace control_1_norm = (control1 - .3389313   )/ .4735272        if year==1
replace control_2_norm = (control2 -   .110687    )/ .313864     if year==1 
replace control_3_norm = (control3 -   .2770992  )/.4477368         if year==1
replace control_4_norm = (control4 -   .1938931   )/ .3954971     if year==1
replace control_5_norm = (control5 -  .3053435  )/  .4607287     if year==1 
replace control_6_norm = (control6 -   .3307869   )/ .4706763       if year==1

gen emo_sum = emo_1_norm + emo_2_norm + emo_3_norm
gen phy_sum = phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm
gen sexual_sum = sexual_1_norm + sexual_2_norm + sexual_3_norm
gen control_sum = control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
gen ipv_sum = emo_1_norm + emo_2_norm + emo_3_norm + phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm + sexual_1_norm + sexual_2_norm + sexual_3_norm + control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm


* we standardize the outcome index again separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome index for control group (Jharkhand) at baseline 

sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 0 

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,218    2.50e-07    2.375981   -.574249   15.76999
     phy_sum |      1,218   -8.55e-07    4.835678  -1.893463   36.04637
  sexual_sum |      1,218    7.23e-07    2.444838  -.4957979   18.79754
 control_sum |      1,213    .0071437     3.70386  -3.334863   12.19507
     ipv_sum |      1,213   -.0283257    9.408564  -6.298372    79.3905



* we standardize the index variables for all observations at baseline (Bihar and Jharkhand)

gen emo_index_norm     = (emo_sum -( 2.50e-07  ))/ 2.375981 if year==0
gen phy_index_norm     = (phy_sum - ( -8.55e-07   ))/ 4.835678  if year==0
gen sexual_index_norm  = (sexual_sum - ( 7.23e-07   ))/  2.444838   if year==0
gen control_index_norm = (control_sum - ( .0071437  ))/ 3.70386  if year==0
gen ipv_index_norm     = (ipv_sum - ( -.0283257   ))/  9.408564     if year==0 



* for endline observations 
* mean and sd of outcome index for control group (Jharkhand) at endline 
sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 1


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,310    8.78e-09    2.580654  -.8479562   10.60824
     phy_sum |      1,310    6.77e-07    5.216771  -2.413563     24.015
  sexual_sum |      1,310   -2.28e-07    2.713724  -.6847831    13.1536
 control_sum |      1,309     .000512    3.868903  -3.543089   10.81181
     ipv_sum |      1,309   -.0004174    11.35749  -7.489392   58.58865



* we standardize the index variables for all observations at endline (Bihar and Jharkhand)

replace emo_index_norm     = (emo_sum -( 8.78e-09  ))/ 2.580654  if year==1
replace phy_index_norm     = (phy_sum - (  6.77e-07   ))/  5.216771  if year==1
replace sexual_index_norm  = (sexual_sum - (-2.28e-07   ))/ 2.713724   if year==1
replace control_index_norm = (control_sum - (.000512  ))/ 3.868903   if year==1
replace ipv_index_norm     = (ipv_sum - ( -.0004174  ))/   11.35749     if year==1   	 


* we have standardized all outcome variables.

* create non-linear terms 

* we first create an indicator for pre-period Bihar observations. These observations are the treated group for the matching step. 
* the other three groups (post-period Bihar, pre-period Jharkhand and post-period Jharkhand) are matched to pre-period Bihar to create the weights.


gen control_year = 1 if year == 0
replace control_year = 0 if year == 1
gen bihar_control = bihar * control_year

* we now match (bihar_control equals 1 for pre-period Bihar observations)
* we omit the base-category dummies to avoid the dummy-variable trap 

psmatch2 bihar_control resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   , kernel common 


diff emo_index_norm [aw=_weight] , p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust






/* Table A11: Sub Sample Analysis: Age at marriage */ 

* heterogeneity using age at marriage 

* open practise file 

* the age at marriage (v511) runs from 0 to 42 years 
* we divide the sample into two subsamples: under 18, and 18 and above 
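
* Optional pre-check (a sketch, not part of the original analysis), to be run on the full
* practise file before either subsample is cut. Stata treats a missing value as larger
* than any number, so the condition v511>17 would also keep observations with missing v511.
count if missing(v511)
count if v511 < 18
count if v511 > 17 & !missing(v511)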


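* under 18 
* (run on a freshly opened practise file)

keep if v511<18
tab v511


* 18 and above
* (re-open the practise file first: after the keep above, no observation with v511>17 remains)

keep if v511>17
tab v511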



* under 18 sub sample 

* normalize the outcome variables 

* We standardize outcome variables separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome variables for control group (Jharkhand) at baseline 
summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==0 

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,366    .0431918    .2033631          0          1
        emo2 |      1,366     .045388    .2082298          0          1
        emo3 |      1,366    .0417277    .2000394          0          1
        phy1 |      1,366    .1054173    .3072028          0          1
        phy2 |      1,366    .2247438    .4175663          0          1
-------------+---------------------------------------------------------
        phy3 |      1,366    .0937042    .2915235          0          1
        phy4 |      1,366    .0644217    .2455925          0          1
        phy5 |      1,366    .0146413    .1201561          0          1
        phy6 |      1,366    .0058565    .0763314          0          1
        phy7 |      1,366    .1098097    .3127669          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,366    .0512445    .2205768          0          1
     sexual2 |      1,366    .0183016    .1340889          0          1
     sexual3 |      1,366    .0278184    .1645126          0          1
    control1 |      1,365    .2864469    .4522664          0          1
    control2 |      1,365    .0879121    .2832708          0          1
-------------+---------------------------------------------------------
    control3 |      1,364    .3123167    .4636082          0          1
    control4 |      1,363     .165077    .3713863          0          1
    control5 |      1,365    .2483516    .4322152          0          1
    control6 |      1,363    .4064563    .4913519          0          1




* standardize all outcome variables for all observations at baseline (Bihar and Jharkhand)

gen emo_1_norm = (emo1 - .0431918   )/ .2033631      if year==0
gen emo_2_norm = (emo2 -  .045388  )/  .2082298    if year==0
gen emo_3_norm = (emo3 - .0417277      )/ .2000394        if year==0

gen phy_1_norm = (phy1 -  .1054173 )/ .3072028      if year==0
gen phy_2_norm = (phy2 - .2247438   )/ .4175663       if year==0
gen phy_3_norm = (phy3 - .0937042  )/ .2915235      if year==0
gen phy_4_norm = (phy4 -   .0644217    )/ .2455925     if year==0
gen phy_5_norm = (phy5 - .0146413  )/ .1201561       if year==0
gen phy_6_norm = (phy6 -  .0058565  )/  .0763314    if year==0
gen phy_7_norm = (phy7 -  .1098097   )/  .3127669     if year==0

gen sexual_1_norm = (sexual1 - .0512445   )/.2205768        if year==0
gen sexual_2_norm = (sexual2 - .0183016  )/ .1340889     if year==0 
gen sexual_3_norm = (sexual3 - .0278184   )/.1645126       if year==0  

gen control_1_norm = (control1 - .2864469   )/ .4522664     if year==0
gen control_2_norm = (control2 -  .0879121 )/ .2832708     if year==0 
gen control_3_norm = (control3 -   .3123167   )/ .4636082      if year==0
gen control_4_norm = (control4 -  .165077  )/  .3713863      if year==0
gen control_5_norm = (control5 -  .2483516   )/ .4322152    if year==0 
gen control_6_norm = (control6 -   .4064563  )/   .4913519      if year==0
 

* for endline observations 
* mean and sd of outcome variables for control group (Jharkhand) at endline

summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==1

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,155    .0787879    .2695241          0          1
        emo2 |      1,155    .0770563    .2667962          0          1
        emo3 |      1,155    .0822511    .2748659          0          1
        phy1 |      1,155    .1471861    .3544449          0          1
        phy2 |      1,155    .2779221    .4481688          0          1
-------------+---------------------------------------------------------
        phy3 |      1,155    .1030303    .3041301          0          1
        phy4 |      1,155    .0874459    .2826097          0          1
        phy5 |      1,155    .0294372    .1691019          0          1
        phy6 |      1,155    .0277056    .1641992          0          1
        phy7 |      1,155    .1480519    .3553053          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,155    .0562771    .2305558          0          1
     sexual2 |      1,155     .038961     .193586          0          1
     sexual3 |      1,155    .0545455    .2271892          0          1
    control1 |      1,155    .3575758    .4794939          0          1
    control2 |      1,155    .1186147    .3234747          0          1
-------------+---------------------------------------------------------
    control3 |      1,155    .2606061    .4391555          0          1
    control4 |      1,154    .1793761    .3838333          0          1
    control5 |      1,155    .3064935    .4612368          0          1
    control6 |      1,154    .3275563    .4695255          0          1


* standardize all outcome variables for all observations at endline (Bihar and Jharkhand)

replace emo_1_norm = (emo1 - .0787879  )/.2695241     if year==1
replace emo_2_norm = (emo2 -  .0770563   )/  .2667962      if year==1
replace emo_3_norm = (emo3 -    .0822511  )/ .2748659        if year==1

replace phy_1_norm = (phy1 -  .1471861 )/ .3544449      if year==1
replace phy_2_norm = (phy2 -  .2779221 )/ .4481688     if year==1
replace phy_3_norm = (phy3 - .1030303   )/  .3041301       if year==1
replace phy_4_norm = (phy4 -  .0874459   )/.2826097       if year==1
replace phy_5_norm = (phy5 -   .0294372   )/ .1691019        if year==1
replace phy_6_norm = (phy6 -   .0277056   )/.1641992       if year==1
replace phy_7_norm = (phy7 -  .1480519  )/ .3553053      if year==1

replace sexual_1_norm = (sexual1 -  .0562771  )/  .2305558         if year==1
replace sexual_2_norm = (sexual2 -   .038961   )/ .193586       if year==1 
replace sexual_3_norm = (sexual3 -  .0545455    )/ .2271892       if year==1 

replace control_1_norm = (control1 -  .3575758   )/ .4794939         if year==1
replace control_2_norm = (control2 -   .1186147    )/  .3234747       if year==1 
replace control_3_norm = (control3 -    .2606061  )/ .4391555        if year==1
replace control_4_norm = (control4 -   .1793761   )/  .3838333      if year==1
replace control_5_norm = (control5 -  .3064935   )/  .4612368      if year==1 
replace control_6_norm = (control6 -  .3275563   )/ .4695255        if year==1

gen emo_sum = emo_1_norm + emo_2_norm + emo_3_norm
gen phy_sum = phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm
gen sexual_sum = sexual_1_norm + sexual_2_norm + sexual_3_norm
gen control_sum = control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
gen ipv_sum = emo_1_norm + emo_2_norm + emo_3_norm + phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm + sexual_1_norm + sexual_2_norm + sexual_3_norm + control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm

* we standardize the outcome index again separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome index for control group (Jharkhand) at baseline 

sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 0 

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,366   -1.45e-07    2.415713  -.6389557   14.07976
     phy_sum |      1,366    2.09e-09    4.667551  -2.014784   35.75781
  sexual_sum |      1,366    4.08e-07    2.473226  -.5379049   17.53196
 control_sum |      1,360   -.0040584    3.670182  -3.463681   11.47607
     ipv_sum |      1,360   -.0325723    9.112768  -6.655326   73.99599


* we standardize the index variables for all observations at baseline (Bihar and Jharkhand)

gen emo_index_norm     = (emo_sum -(  -1.45e-07  ))/  2.415713  if year==0
gen phy_index_norm     = (phy_sum - (  2.09e-09   ))/  4.667551   if year==0
gen sexual_index_norm  = (sexual_sum - (  4.08e-07   ))/   2.473226    if year==0
gen control_index_norm = (control_sum - ( -.0040584  ))/ 3.670182  if year==0
gen ipv_index_norm     = (ipv_sum - ( -.0325723  ))/ 9.112768    if year==0 



* for endline observations 
* mean and sd of outcome index for control group (Jharkhand) at endline 
sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 1


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,155   -2.24e-07    2.611446   -.880384   10.21618
     phy_sum |      1,155    4.72e-07    5.159167   -2.44308   24.25429
  sexual_sum |      1,155   -1.75e-07    2.708018   -.685441   13.21918
 control_sum |      1,153   -.0031288    3.869168  -3.535315   10.82194
     ipv_sum |      1,153    -.000551    11.23868   -7.54422   58.51159




* we standardize the index variables for all observations at endline (Bihar and Jharkhand)

replace emo_index_norm     = (emo_sum -( -2.24e-07 ))/  2.611446   if year==1
replace phy_index_norm     = (phy_sum - ( 4.72e-07   ))/  5.159167   if year==1
replace sexual_index_norm  = (sexual_sum - (-1.75e-07   ))/  2.708018   if year==1
replace control_index_norm = (control_sum - (-.0031288  ))/  3.869168    if year==1
replace ipv_index_norm     = (ipv_sum - (  -.000551  ))/ 11.23868     if year==1   	 


* we have standardized all outcome variables.

* create non-linear terms

* we first create an indicator for pre-period Bihar observations. These observations are the treated group for the matching step. 
* the other three groups (post-period Bihar, pre-period Jharkhand and post-period Jharkhand) are matched to pre-period Bihar to create the weights.


gen control_year = 1 if year == 0
replace control_year = 0 if year == 1
gen bihar_control = bihar * control_year

* we now match (bihar_control equals 1 for pre-period Bihar observations)
* we omit the base-category dummies to avoid the dummy-variable trap 

psmatch2 bihar_control resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   , kernel common 


diff emo_index_norm [aw=_weight] , p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust



* 18 and above sub sample (re-open the practise file and apply keep if v511>17 before running this block) 

* normalize the outcome variables 

* We standardize outcome variables separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome variables for control group (Jharkhand) at baseline 
summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==0 

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,158    .0388601     .193345          0          1
        emo2 |      1,158    .0310881    .1736308          0          1
        emo3 |      1,158    .0267703    .1614812          0          1
        phy1 |      1,158    .0647668    .2462203          0          1
        phy2 |      1,158    .1804836    .3847559          0          1
-------------+---------------------------------------------------------
        phy3 |      1,158    .0500864    .2182174          0          1
        phy4 |      1,158    .0388601     .193345          0          1
        phy5 |      1,158    .0069085    .0828653          0          1
        phy6 |      1,158    .0051813    .0718259          0          1
        phy7 |      1,158    .0639033    .2446862          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,158    .0388601     .193345          0          1
     sexual2 |      1,158    .0172712    .1303362          0          1
     sexual3 |      1,158    .0302245    .1712785          0          1
    control1 |      1,157    .2428695    .4290023          0          1
    control2 |      1,158    .0613126    .2400065          0          1
-------------+---------------------------------------------------------
    control3 |      1,157    .3292999    .4701622          0          1
    control4 |      1,157    .1719965    .3775406          0          1
    control5 |      1,156    .2361592    .4249049          0          1
    control6 |      1,156    .4134948    .4926731          0          1



* standardize all outcome variables for all observations at baseline (Bihar and Jharkhand)

gen emo_1_norm = (emo1 - .0388601   )/  .193345      if year==0
gen emo_2_norm = (emo2 -   .0310881   )/  .1736308     if year==0
gen emo_3_norm = (emo3 - .0267703   )/  .1614812          if year==0

gen phy_1_norm = (phy1 -   .0647668   )/ .2462203       if year==0
gen phy_2_norm = (phy2 - .1804836   )/ .3847559        if year==0
gen phy_3_norm = (phy3 - .0500864 )/ .2182174      if year==0
gen phy_4_norm = (phy4 -  .0388601  )/ .193345      if year==0
gen phy_5_norm = (phy5 - .0069085  )/  .0828653       if year==0
gen phy_6_norm = (phy6 -  .0051813  )/ .0718259     if year==0
gen phy_7_norm = (phy7 -  .0639033   )/  .2446862       if year==0

gen sexual_1_norm = (sexual1 -  .0388601  )/ .193345        if year==0
gen sexual_2_norm = (sexual2 - .0172712 )/ .1303362      if year==0 
gen sexual_3_norm = (sexual3 - .0302245  )/.1712785        if year==0  

gen control_1_norm = (control1 -  .2428695   )/  .4290023      if year==0
gen control_2_norm = (control2 -  .0613126  )/ .2400065    if year==0 
gen control_3_norm = (control3 -   .3292999   )/  .4701622       if year==0
gen control_4_norm = (control4 -   .1719965 )/  .3775406      if year==0
gen control_5_norm = (control5 -  .2361592  )/  .4249049     if year==0 
gen control_6_norm = (control6 -   .4134948   )/ .4926731       if year==0
 

* for endline observations 
* mean and sd of outcome variables for control group (Jharkhand) at endline

summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==1

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,184     .066723    .2496471          0          1
        emo2 |      1,184    .0557432    .2295222          0          1
        emo3 |      1,184    .0565878     .231151          0          1
        phy1 |      1,184    .1148649    .3189935          0          1
        phy2 |      1,184     .222973    .4164162          0          1
-------------+---------------------------------------------------------
        phy3 |      1,184    .0675676    .2511084          0          1
        phy4 |      1,184    .0608108    .2390839          0          1
        phy5 |      1,184    .0244932      .15464          0          1
        phy6 |      1,184    .0236486    .1520161          0          1
        phy7 |      1,184    .1064189    .3085033          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,184    .0388514    .1933222          0          1
     sexual2 |      1,184    .0363176    .1871582          0          1
     sexual3 |      1,184    .0371622     .189239          0          1
    control1 |      1,183     .339814    .4738462          0          1
    control2 |      1,184    .1064189    .3085033          0          1
-------------+---------------------------------------------------------
    control3 |      1,183     .262891    .4403899          0          1
    control4 |      1,183    .1910397    .3932865          0          1
    control5 |      1,184    .2913851    .4545925          0          1
    control6 |      1,184    .3116554    .4633656          0          1



* standardize all outcome variables for all observations at endline (bihar and Jharkhand)

replace emo_1_norm = (emo1 - .066723 )/.2496471      if year==1
replace emo_2_norm = (emo2 -  .0557432   )/ .2295222       if year==1
replace emo_3_norm = (emo3 -   .0565878   )/  .231151       if year==1

replace phy_1_norm = (phy1 -  .1148649  )/ .3189935       if year==1
replace phy_2_norm = (phy2 -   .222973  )/   .4164162     if year==1
replace phy_3_norm = (phy3 - .0675676   )/ .2511084         if year==1
replace phy_4_norm = (phy4 -  .0608108 )/.2390839        if year==1
replace phy_5_norm = (phy5 -   .0244932  )/ .15464         if year==1
replace phy_6_norm = (phy6 -   .0236486   )/ .1520161        if year==1
replace phy_7_norm = (phy7 -   .1064189  )/ .3085033      if year==1

replace sexual_1_norm = (sexual1 -  .0388514  )/  .1933222         if year==1
replace sexual_2_norm = (sexual2 -   .0363176  )/ .1871582       if year==1 
replace sexual_3_norm = (sexual3 -  .0371622   )/   .189239        if year==1 

replace control_1_norm = (control1 -   .339814  )/ .4738462         if year==1
replace control_2_norm = (control2 -   .1064189   )/ .3085033         if year==1 
replace control_3_norm = (control3 -      .262891   )/ .4403899        if year==1
replace control_4_norm = (control4 -    .1910397   )/ .3932865       if year==1
replace control_5_norm = (control5 -   .2913851  )/   .4545925       if year==1 
replace control_6_norm = (control6 -   .3116554   )/ .4633656          if year==1
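
* Optional sketch (not part of the original code): the hand-typed constants above
* can also be read from summarize, which stores the mean and sd in r(mean) and
* r(sd). The z-suffixed names are hypothetical, chosen so that nothing above is
* overwritten. Values may differ from the norm variables in the last digits
* because the typed constants are rounded.

foreach v of varlist emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6 phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 {
    quietly gen double `v'_z = .
    forvalues t = 0/1 {
        * control-group (Jharkhand) moments for this wave
        quietly summarize `v' if bihar == 0 & year == `t'
        quietly replace `v'_z = (`v' - r(mean)) / r(sd) if year == `t'
    }
}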


gen emo_sum = emo_1_norm + emo_2_norm + emo_3_norm
gen phy_sum = phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm
gen sexual_sum = sexual_1_norm + sexual_2_norm + sexual_3_norm
gen control_sum = control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
gen ipv_sum = emo_1_norm + emo_2_norm + emo_3_norm + phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm + sexual_1_norm + sexual_2_norm + sexual_3_norm + control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
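
* Note: the plus operator returns missing when any component is missing, so a
* respondent with one unanswered item drops out of the corresponding sum. This is
* consistent with control_sum and ipv_sum having fewer observations than the
* other sums in the output below. A small consistency check, as a sketch
* (ipv_sum_chk is a hypothetical scratch variable and the tolerance is arbitrary):

gen double ipv_sum_chk = emo_sum + phy_sum + sexual_sum + control_sum
assert missing(ipv_sum) == missing(ipv_sum_chk)
assert abs(ipv_sum - ipv_sum_chk) < 1e-3 if !missing(ipv_sum)
drop ipv_sum_chk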

* we standardize the outcome index again separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome index for control group (Jharkhand) at baseline 

sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 0 


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,158   -1.38e-07    2.390226  -.5458152    16.5783
     phy_sum |      1,158    9.99e-08    4.681017  -1.579315   44.91302
  sexual_sum |      1,158   -1.54e-07    2.445412  -.5099651   18.17305
 control_sum |      1,152    .0074801    3.719866  -3.372637   12.28376
     ipv_sum |      1,152   -.0067657    9.295027  -6.007733   87.86127


* we standardize the index variables for all observations at baseline (bihar and Jharkhand)

gen emo_index_norm     = (emo_sum -(   -1.38e-07  ))/   2.390226  if year==0
gen phy_index_norm     = (phy_sum - (  9.99e-08   ))/  4.681017   if year==0
gen sexual_index_norm  = (sexual_sum - ( -1.54e-07  ))/  2.445412     if year==0
gen control_index_norm = (control_sum - ( .0074801  ))/ 3.719866   if year==0
gen ipv_index_norm     = (ipv_sum - ( -.0067657  ))/  9.295027     if year==0 



* for endline observations 
* mean and sd of outcome index for control group (Jharkhand) at endline 
sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 1



    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,184    2.71e-07    2.470274  -.7549443   11.93376
     phy_sum |      1,184    3.37e-07    5.071113  -2.077876   27.90974
  sexual_sum |      1,184   -6.05e-07    2.704598  -.5913917   15.20872
 control_sum |      1,182   -.0067512    3.933068  -3.458366   11.06477
     ipv_sum |      1,182   -.0009572    11.09193  -6.882579   66.11699





* we standardize the index variables for all observations at endline (bihar and Jharkhand)

replace emo_index_norm     = (emo_sum -( 2.71e-07))/  2.470274   if year==1
replace phy_index_norm     = (phy_sum - ( 3.37e-07  ))/   5.071113   if year==1
replace sexual_index_norm  = (sexual_sum - (-6.05e-07  ))/   2.704598    if year==1
replace control_index_norm = (control_sum - (-.0067512  ))/ 3.933068    if year==1
replace ipv_index_norm     = (ipv_sum - (  -.0009572 ))/  11.09193     if year==1   
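
* Optional check (a sketch, not in the original): after standardizing, each index
* should have mean close to 0 and sd close to 1 among the control group
* (Jharkhand) within each wave.

tabstat emo_index_norm phy_index_norm sexual_index_norm control_index_norm ipv_index_norm if bihar == 0, by(year) statistics(mean sd n)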


* we have standardized all outcome variables .

* create non linear terms 

* we first create an indicator for Bihar pre (baseline) observations. These observations are our treated group for matching purposes. 
* we then match the remaining three groups (post Bihar, pre Jharkhand and post Jharkhand) to pre Bihar and create weights.


gen control_year = 1 if year == 0
replace control_year = 0 if year == 1
gen bihar_control = bihar * control_year
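
* A compact equivalent, shown as a sketch (bihar_pre is a hypothetical name). It
* assumes bihar and year are coded 0/1. The logical expression is 1 for Bihar
* baseline rows and 0 otherwise, and the if-qualifier leaves rows with missing
* bihar or year as missing.

gen byte bihar_pre = (bihar == 1 & year == 0) if !missing(bihar, year)
assert bihar_pre == bihar_control if !missing(bihar_control)
tab bihar year if bihar_control == 1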

* we now match (bihar_control takes the value 1 for pre Bihar observations)
* the base-category dummies are left out to avoid the dummy variable trap 
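
* psmatch2, diff and univar are user-written commands. A minimal sketch for
* installing any that are absent, assuming they are obtained from SSC:

foreach pkg in psmatch2 diff univar {
    capture which `pkg'
    if _rc ssc install `pkg'
}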

psmatch2 bihar_control resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   , kernel common 
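
* After matching, psmatch2 leaves _pscore, _weight and _support in the data. A
* short diagnostic sketch: how many observations fall off common support, and how
* the scores and kernel weights are distributed across the four state-by-wave
* groups. Note that bysort changes the sort order of the data.

tab _support
bysort bihar year: summarize _pscore _weight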


diff emo_index_norm [aw=_weight] , p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust
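
* For reference, a sketch of the weighted regression that a covariate-adjusted
* difference-in-differences of this kind should correspond to. The coefficient on
* 1.bihar#1.year is the DiD estimate. This is an illustration that assumes bihar
* and year are coded 0/1, not a replacement for the diff call above. The local
* names xvars and distvars are hypothetical; the loop collects the district
* dummies so they do not have to be typed out.

local xvars resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4
local distvars
forvalues d = 1/62 {
    local distvars `distvars' dist`d'
}
regress emo_index_norm i.bihar##i.year `xvars' `distvars' [aw=_weight], vce(cluster distyear)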











/* Table A12: Sub Sample Analysis: Spousal age gap */

* heterogeneity by spousal age gap 

* open practise file 

* calculate spousal age gap (husband's age minus woman's age)


tab v012
tab v730
gen spouse_agegap = v730-v012
tab spouse_agegap
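
* The quantile output further down reports gaps between -33 and 70 years. A
* sketch for inspecting such cases before the median split. v012 and v730 are the
* DHS respondent and partner ages; the threshold of 30 years is an arbitrary
* illustration.

count if missing(spouse_agegap)
count if spouse_agegap < 0
list v012 v730 spouse_agegap if abs(spouse_agegap) > 30 & !missing(spouse_agegap)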



* we create Bihar pre, Bihar post, Jharkhand pre and Jharkhand post sub-datasets (remember that the high variable is defined within each dataset; the code below is for reference)

* we then calculate the median in each dataset
* we assign high = 1 to observations with an above-median value 

* open bihar pre 

* calculate the median of spouse age gap 

univar  spouse_agegap

. univar  spouse_agegap
                                        -------------- Quantiles --------------
Variable       n     Mean     S.D.      Min      .25      Mdn      .75      Max
-------------------------------------------------------------------------------
spouse_agegap    3907     4.66     4.12   -29.00     3.00     4.00     6.00    50.00
-------------------------------------------------------------------------------

gen high=1 if spouse_agegap > 4
replace high=0 if high==.
tab high
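
* Caution, with a sketch: Stata treats a missing value as larger than any number,
* so the condition spouse_agegap > 4 is true when spouse_agegap is missing. The
* version below reads the median from summarize and leaves such rows missing.
* high_alt is a hypothetical name. With no missing gaps it should match high.

quietly summarize spouse_agegap, detail
gen byte high_alt = (spouse_agegap > r(p50)) if !missing(spouse_agegap)
tab high high_alt, missing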




* bihar post 

univar  spouse_agegap


. univar  spouse_agegap
                                        -------------- Quantiles --------------
Variable       n     Mean     S.D.      Min      .25      Mdn      .75      Max
-------------------------------------------------------------------------------
spouse_agegap    3604     4.89     3.99   -33.00     3.00     4.00     6.00    70.00

gen high=1 if spouse_agegap > 4
replace high=0 if high==.
tab high



* jharkhand pre 


univar  spouse_agegap

. univar  spouse_agegap
                                        -------------- Quantiles --------------
Variable       n     Mean     S.D.      Min      .25      Mdn      .75      Max
-------------------------------------------------------------------------------
spouse_agegap    2524     5.02     3.68   -27.00     3.00     5.00     6.00    26.00
-------------------------------------------------------------------------------

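* the median is 5 in this group; with ages in whole years, >= 5 selects the same gaps as > 4 in the other groups
gen high=1 if spouse_agegap >= 5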
replace high=0 if high==.
tab high


* jharkhand post 

univar  spouse_agegap

. univar  spouse_agegap
                                        -------------- Quantiles --------------
Variable       n     Mean     S.D.      Min      .25      Mdn      .75      Max
-------------------------------------------------------------------------------
spouse_agegap    2339     4.80     3.63   -25.00     3.00     4.00     6.00    24.00
-------------------------------------------------------------------------------

gen high=1 if spouse_agegap > 4
replace high=0 if high==.
tab high



* the high variable is now created in each file; append all 4 files 



* save  as appended main file
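
* Sanity check on the appended file (a sketch): with ages in whole years, the
* cutoffs used above (> 4 in three groups, >= 5 for Jharkhand pre) all select
* gaps of 5 years or more, so the pooled rule below should reproduce high.

assert high == (spouse_agegap > 4) if !missing(spouse_agegap)
tab bihar year, summarize(high)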



* we now create above median and below median subsamples (high=1 and high=0)

* open appended main file 



tab high
keep if high==1
tab high

* save as above median 



* open appended main file 

tab high
keep if high==0
tab high

*  save as below median 
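
* The two subsample files can also be written in one pass with preserve and
* restore. A sketch only: the three file names are hypothetical placeholders.

use "appended_main.dta", clear
preserve
keep if high == 1
save "above_median.dta", replace
restore
keep if high == 0
save "below_median.dta", replace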


* above median and below median are our two subsample datasets 

* open above median sample 

* normalize the outcome variables 

* We standardize outcome variables separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome variables for control group (Jharkhand) at baseline 
summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==0 

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,266    .0418641    .2003578          0          1
        emo2 |      1,266     .042654    .2021557          0          1
        emo3 |      1,266    .0315956      .17499          0          1
        phy1 |      1,266    .0892575    .2852277          0          1
        phy2 |      1,266     .199842    .4000395          0          1
-------------+---------------------------------------------------------
        phy3 |      1,266    .0781991    .2685907          0          1
        phy4 |      1,266    .0545024     .227096          0          1
        phy5 |      1,266    .0110585    .1046174          0          1
        phy6 |      1,266    .0055292    .0741822          0          1
        phy7 |      1,266    .0837283    .2770893          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,266    .0387046    .1929662          0          1
     sexual2 |      1,266     .014218    .1184354          0          1
     sexual3 |      1,266    .0260664    .1593956          0          1
    control1 |      1,265    .2426877    .4288774          0          1
    control2 |      1,266    .0766193    .2660915          0          1
-------------+---------------------------------------------------------
    control3 |      1,264    .3093354    .4624026          0          1
    control4 |      1,264    .1598101    .3665749          0          1
    control5 |      1,264    .2333861    .4231533          0          1
    control6 |      1,264    .4208861    .4938967          0          1



* standardize all outcome variables for all observations at baseline (bihar and Jharkhand)

gen emo_1_norm = (emo1 - .0418641  )/.2003578   if year==0
gen emo_2_norm = (emo2 -  .042654 )/.2021557  if year==0
gen emo_3_norm = (emo3 - .0315956  )/.17499   if year==0

gen phy_1_norm = (phy1 - .0892575  )/.2852277   if year==0
gen phy_2_norm = (phy2 - .199842  )/.4000395   if year==0
gen phy_3_norm = (phy3 - .0781991   )/.2685907   if year==0
gen phy_4_norm = (phy4 -  .0545024 )/.227096  if year==0
gen phy_5_norm = (phy5 - .0110585 )/ .1046174    if year==0
gen phy_6_norm = (phy6 - .0055292 )/ .0741822   if year==0
gen phy_7_norm = (phy7 -  .0837283 )/  .2770893   if year==0

gen sexual_1_norm = (sexual1 -  .0387046   )/.1929662   if year==0
gen sexual_2_norm = (sexual2 -  .014218   )/  .1184354   if year==0 
gen sexual_3_norm = (sexual3 - .0260664 )/.1593956   if year==0  

gen control_1_norm = (control1 - .2426877   )/.4288774  if year==0
gen control_2_norm = (control2 -  .0766193 )/ .2660915   if year==0 
gen control_3_norm = (control3 -  .3093354  )/ .4624026  if year==0
gen control_4_norm = (control4 - .1598101  )/ .3665749  if year==0
gen control_5_norm = (control5 -  .2333861 )/ .4231533   if year==0 
gen control_6_norm = (control6 -  .4208861)/.4938967  if year==0


* for endline observations 
* mean and sd of outcome variables for control group (Jharkhand) at endline

summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==1

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,060    .0783019    .2687729          0          1
        emo2 |      1,060    .0698113    .2549491          0          1
        emo3 |      1,060    .0707547    .2565357          0          1
        phy1 |      1,060    .1386792    .3457746          0          1
        phy2 |      1,060     .254717    .4359077          0          1
-------------+---------------------------------------------------------
        phy3 |      1,060    .0783019    .2687729          0          1
        phy4 |      1,060    .0716981    .2581092          0          1
        phy5 |      1,060    .0254717    .1576272          0          1
        phy6 |      1,060    .0226415     .148828          0          1
        phy7 |      1,060    .1377358    .3447852          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,060     .045283    .2080223          0          1
     sexual2 |      1,060    .0424528    .2017151          0          1
     sexual3 |      1,060    .0471698     .212102          0          1
    control1 |      1,060    .3320755    .4711802          0          1
    control2 |      1,060    .1084906    .3111458          0          1
-------------+---------------------------------------------------------
    control3 |      1,060    .2707547    .4445594          0          1
    control4 |      1,059    .1813031    .3854512          0          1
    control5 |      1,060    .2867925    .4524773          0          1
    control6 |      1,060    .3264151    .4691225          0          1



* standardize all outcome variables for all observations at endline (bihar and Jharkhand)

replace emo_1_norm = (emo1 - .0783019 )/.2687729   if year==1
replace emo_2_norm = (emo2 -   .0698113   )/.2549491   if year==1
replace emo_3_norm = (emo3 -    .0707547 )/  .2565357      if year==1

replace phy_1_norm = (phy1 -  .1386792 )/ .3457746     if year==1
replace phy_2_norm = (phy2 - .254717 )/ .4359077    if year==1
replace phy_3_norm = (phy3 - .0783019  )/.2687729  if year==1
replace phy_4_norm = (phy4 -  .0716981 )/ .2581092   if year==1
replace phy_5_norm = (phy5 -  .0254717 )/ .1576272     if year==1
replace phy_6_norm = (phy6 - .0226415   )/.148828    if year==1
replace phy_7_norm = (phy7 - .1377358  )/  .3447852     if year==1

replace sexual_1_norm = (sexual1 - .045283 )/ .2080223    if year==1
replace sexual_2_norm = (sexual2 - .0424528  )/ .2017151     if year==1 
replace sexual_3_norm = (sexual3 - .0471698 )/ .212102    if year==1 

replace control_1_norm = (control1 - .3320755  )/.4711802    if year==1
replace control_2_norm = (control2 -   .1084906   )/.3111458    if year==1 
replace control_3_norm = (control3 -   .2707547   )/.4445594     if year==1
replace control_4_norm = (control4 -   .1813031   )/ .3854512     if year==1
replace control_5_norm = (control5 - .2867925 )/  .4524773  if year==1 
replace control_6_norm = (control6 -   .3264151 )/ .4691225    if year==1
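
* A sketch for catching typing slips in the constants above: within each wave,
* every standardized item should have mean near 0 and sd near 1 among the
* Jharkhand observations. The tolerance is an arbitrary choice, loose enough to
* allow for the rounding of the typed constants.

foreach v of varlist emo_?_norm phy_?_norm sexual_?_norm control_?_norm {
    forvalues t = 0/1 {
        quietly summarize `v' if bihar == 0 & year == `t'
        assert abs(r(mean)) < 1e-4 & abs(r(sd) - 1) < 1e-4
    }
}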

gen emo_sum = emo_1_norm + emo_2_norm + emo_3_norm
gen phy_sum = phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm
gen sexual_sum = sexual_1_norm + sexual_2_norm + sexual_3_norm
gen control_sum = control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
gen ipv_sum = emo_1_norm + emo_2_norm + emo_3_norm + phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm + sexual_1_norm + sexual_2_norm + sexual_3_norm + control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm

* we standardize the outcome index again separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome index for control group (Jharkhand) at baseline 

sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 0 


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,266    1.77e-07    2.394787  -.6004991   15.05187
     phy_sum |      1,266   -4.72e-07    4.833478  -1.826043   38.95415
  sexual_sum |      1,266   -3.22e-07    2.459579  -.4841584   19.41522
 control_sum |      1,262    .0040719    3.659727  -3.362454   12.00582
     ipv_sum |      1,262    .0046185    9.570974  -6.273155   81.81811






* we standardize the index variables for all observations at baseline (bihar and Jharkhand)

gen emo_index_norm     = (emo_sum -( 1.77e-07))/2.394787 if year==0
gen phy_index_norm     = (phy_sum - ( -4.72e-07 ))/ 4.833478  if year==0
gen sexual_index_norm  = (sexual_sum - ( -3.22e-07 ))/2.459579   if year==0
gen control_index_norm = (control_sum - ( .0040719))/ 3.659727  if year==0
gen ipv_index_norm     = (ipv_sum - ( .0046185 ))/  9.570974    if year==0   	


	 
* for endline observations 
* mean and sd of outcome index for control group (Jharkhand) at endline 
sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 1

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,060    1.10e-07    2.560261   -.840964   10.70009
     phy_sum |      1,060    2.44e-07     5.03535  -2.267728   26.47694
  sexual_sum |      1,060    3.21e-07    2.747663  -.6505346   13.82884
 control_sum |      1,059   -.0043007    3.953755  -3.462488   11.05925
     ipv_sum |      1,059   -.0007503    11.26443  -7.221714   62.06512






* we standardize the index variables for all observations at endline (bihar and Jharkhand)

replace emo_index_norm     = (emo_sum -(  1.10e-07 ))/ 2.560261 if year==1
replace phy_index_norm     = (phy_sum - ( 2.44e-07 ))/   5.03535  if year==1
replace sexual_index_norm  = (sexual_sum - (3.21e-07    ))/ 2.747663   if year==1
replace control_index_norm = (control_sum - (-.0043007 ))/ 3.953755    if year==1
replace ipv_index_norm     = (ipv_sum - (-.0007503 ))/ 11.26443   if year==1   	 
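
* The index step can be looped in the same way as the item step. A sketch under
* the assumption that the five sums above exist; the index_z names are
* hypothetical and leave the index_norm variables untouched.

foreach s in emo phy sexual control ipv {
    quietly gen double `s'_index_z = .
    forvalues t = 0/1 {
        * Jharkhand mean and sd of the summed index in this wave
        quietly summarize `s'_sum if bihar == 0 & year == `t'
        quietly replace `s'_index_z = (`s'_sum - r(mean)) / r(sd) if year == `t'
    }
}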


* we have standardized all outcome variables .

* create non linear terms 

* we first create an indicator for Bihar pre (baseline) observations. These observations are our treated group for matching purposes. 
* we then match the remaining three groups (post Bihar, pre Jharkhand and post Jharkhand) to pre Bihar and create weights.


gen control_year = 1 if year == 0
replace control_year = 0 if year == 1
gen bihar_control = bihar * control_year

* we now match (bihar_control takes the value 1 for pre Bihar observations)
* the base-category dummies are left out to avoid the dummy variable trap 

psmatch2 bihar_control resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   , kernel common 


diff emo_index_norm [aw=_weight] , p(year) t(bihar) cov(resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4 dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62) report cluster(distyear) robust
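
* The call above estimates the effect on the emotional violence index only. A
* sketch of running the same specification for all five indices, assuming the
* other four are wanted with identical options. The local names xvars and
* distvars are hypothetical and only shorten the command line.

local xvars resp_age resp_edu year_marr caste husb_edu poorest poorer middle richer religion hhd_gender place_residence hhd_age hh_size wi1_age wi2_age wi3_age wi4_age hedu_cas2 hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4 cas2_rel2 edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4
local distvars
forvalues d = 1/62 {
    local distvars `distvars' dist`d'
}
foreach y in emo_index_norm phy_index_norm sexual_index_norm control_index_norm ipv_index_norm {
    diff `y' [aw=_weight] , p(year) t(bihar) cov(`xvars' `distvars') report cluster(distyear) robust
}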




* below median dataset 

* normalize the outcome variables 

* We standardize outcome variables separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome variables for control group (Jharkhand) at baseline 
summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==0 


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,258    .0405405    .1973017          0          1
        emo2 |      1,258    .0349762    .1837925          0          1
        emo3 |      1,258    .0381558    .1916485          0          1
        phy1 |      1,258    .0842607    .2778889          0          1
        phy2 |      1,258     .209062    .4068005          0          1
-------------+---------------------------------------------------------
        phy3 |      1,258    .0691574    .2538225          0          1
        phy4 |      1,258    .0508744    .2198286          0          1
        phy5 |      1,258    .0111288    .1049461          0          1
        phy6 |      1,258    .0055644    .0744166          0          1
        phy7 |      1,258    .0937997    .2916658          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,258    .0524642      .22305          0          1
     sexual2 |      1,258    .0214626    .1449783          0          1
     sexual3 |      1,258    .0317965    .1755277          0          1
    control1 |      1,257    .2903739    .4541156          0          1
    control2 |      1,257    .0747812    .2631427          0          1
-------------+---------------------------------------------------------
    control3 |      1,257    .3309467    .4707412          0          1
    control4 |      1,256    .1767516    .3816103          0          1
    control5 |      1,257    .2521877    .4344413          0          1
    control6 |      1,255    .3984064    .4897651          0          1



* standardize all outcome variables for all observations at baseline (bihar and Jharkhand)

gen emo_1_norm = (emo1 - .0405405   )/.1973017   if year==0
gen emo_2_norm = (emo2 -  .0349762 )/ .1837925  if year==0
gen emo_3_norm = (emo3 - .0381558  )/ .1916485    if year==0

gen phy_1_norm = (phy1 - .0842607  )/.2778889   if year==0
gen phy_2_norm = (phy2 - .209062  )/ .4068005    if year==0
gen phy_3_norm = (phy3 - .0691574    )/ .2538225   if year==0
gen phy_4_norm = (phy4 -  .0508744  )/.2198286  if year==0
gen phy_5_norm = (phy5 - .0111288 )/ .1049461     if year==0
gen phy_6_norm = (phy6 - .0055644  )/ .0744166     if year==0
gen phy_7_norm = (phy7 -  .0937997 )/ .2916658   if year==0

gen sexual_1_norm = (sexual1 -  .0524642  )/.22305   if year==0
gen sexual_2_norm = (sexual2 -  .0214626   )/ .1449783    if year==0 
gen sexual_3_norm = (sexual3 - .0317965 )/.1755277    if year==0  

gen control_1_norm = (control1 - .2903739   )/ .4541156    if year==0
gen control_2_norm = (control2 -  .0747812 )/ .2631427  if year==0 
gen control_3_norm = (control3 -  .3309467   )/  .4707412    if year==0
gen control_4_norm = (control4 - .1767516 )/ .3816103  if year==0
gen control_5_norm = (control5 -  .2521877 )/ .4344413   if year==0 
gen control_6_norm = (control6 -  .3984064 )/.4897651    if year==0


* for endline observations 
* mean and sd of outcome variables for control group (Jharkhand) at endline

summ emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6  phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6 if bihar == 0 & year==1


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        emo1 |      1,279    .0680219     .251882          0          1
        emo2 |      1,279    .0633307    .2436521          0          1
        emo3 |      1,279    .0680219     .251882          0          1
        phy1 |      1,279    .1243159    .3300706          0          1
        phy2 |      1,279    .2462862    .4310157          0          1
-------------+---------------------------------------------------------
        phy3 |      1,279    .0906959    .2872884          0          1
        phy4 |      1,279    .0758405    .2648463          0          1
        phy5 |      1,279     .028147    .1654574          0          1
        phy6 |      1,279     .028147    .1654574          0          1
        phy7 |      1,279     .118061    .3228065          0          1
-------------+---------------------------------------------------------
     sexual1 |      1,279    .0492572    .2164893          0          1
     sexual2 |      1,279      .03362    .1803195          0          1
     sexual3 |      1,279    .0445661    .2064298          0          1
    control1 |      1,278    .3622848    .4808487          0          1
    control2 |      1,279    .1157154    .3200085          0          1
-------------+---------------------------------------------------------
    control3 |      1,278    .2543036    .4356395          0          1
    control4 |      1,278    .1885759    .3913245          0          1
    control5 |      1,279     .308835    .4621937          0          1
    control6 |      1,278    .3137715    .4642064          0          1

* standardize all outcome variables for all observations at endline (bihar and Jharkhand)

replace emo_1_norm = (emo1 - .0680219 )/.251882    if year==1
replace emo_2_norm = (emo2 -   .0633307   )/.2436521   if year==1
replace emo_3_norm = (emo3 -    .0680219  )/ .251882      if year==1

replace phy_1_norm = (phy1 -  .1243159  )/ .3300706      if year==1
replace phy_2_norm = (phy2 - .2462862 )/ .4310157     if year==1
replace phy_3_norm = (phy3 - .0906959  )/.2872884   if year==1
replace phy_4_norm = (phy4 -  .0758405  )/.2648463    if year==1
replace phy_5_norm = (phy5 -  .028147  )/ .1654574     if year==1
replace phy_6_norm = (phy6 -  .028147   )/.1654574     if year==1
replace phy_7_norm = (phy7 - .118061  )/ .3228065     if year==1

replace sexual_1_norm = (sexual1 - .0492572 )/ .2164893     if year==1
replace sexual_2_norm = (sexual2 -  .03362  )/ .1803195      if year==1 
replace sexual_3_norm = (sexual3 - .0445661  )/ .2064298    if year==1 

replace control_1_norm = (control1 - .3622848  )/.4808487     if year==1
replace control_2_norm = (control2 -   .1157154   )/.3200085     if year==1 
replace control_3_norm = (control3 -    .2543036  )/.4356395       if year==1
replace control_4_norm = (control4 -   .1885759    )/ .3913245    if year==1
replace control_5_norm = (control5 - .308835 )/ .4621937   if year==1 
replace control_6_norm = (control6 -   .3137715  )/  .4642064     if year==1
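
* The same standardization is repeated for each subsample with new constants. A
* sketch of a small helper that computes them on the fly. std_to_control and the
* std-suffixed name are hypothetical and not part of the original code. The
* helper assumes bihar and year are coded 0/1 as above.

capture program drop std_to_control
program define std_to_control
    syntax varname, GENerate(name)
    quietly gen double `generate' = .
    forvalues t = 0/1 {
        * moments of the Jharkhand (control) observations in this wave
        quietly summarize `varlist' if bihar == 0 & year == `t'
        quietly replace `generate' = (`varlist' - r(mean)) / r(sd) if year == `t'
    }
end

* example call: standardized copy of emo1
std_to_control emo1, generate(emo1_std)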

gen emo_sum = emo_1_norm + emo_2_norm + emo_3_norm
gen phy_sum = phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm
gen sexual_sum = sexual_1_norm + sexual_2_norm + sexual_3_norm
gen control_sum = control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
gen ipv_sum = emo_1_norm + emo_2_norm + emo_3_norm + phy_1_norm + phy_2_norm + phy_3_norm + phy_4_norm + phy_5_norm + phy_6_norm + phy_7_norm + sexual_1_norm + sexual_2_norm + sexual_3_norm + control_1_norm + control_2_norm + control_3_norm + control_4_norm + control_5_norm + control_6_norm
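
* A sketch for seeing how many respondents are lost from ipv_sum because an item
* is unanswered (n_item_miss is a hypothetical name). The final count is expected
* to be 0 as long as year is never missing.

egen n_item_miss = rowmiss(emo1 emo2 emo3 phy1 phy2 phy3 phy4 phy5 phy6 phy7 sexual1 sexual2 sexual3 control1 control2 control3 control4 control5 control6)
tab n_item_miss if bihar == 0, missing
count if missing(ipv_sum) & n_item_miss == 0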


* we standardize the outcome index again separately for baseline and endline datasets 
* for baseline observations 
* mean and sd of outcome index for control group (Jharkhand) at baseline 

sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 0 


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,258   -2.72e-08    2.420472  -.5948699   15.13231
     phy_sum |      1,258   -2.83e-07    4.518659  -1.823443   39.11723
  sexual_sum |      1,258    4.03e-07    2.456276  -.5644009   16.51359
 control_sum |      1,250   -.0014415    3.726277   -3.48377   11.60692
     ipv_sum |      1,250   -.0446896    8.817079  -6.466484   77.94393



* we standardize the index variables for all observations at baseline (bihar and Jharkhand)

gen emo_index_norm     = (emo_sum -( -2.72e-08 ))/2.420472 if year==0
gen phy_index_norm     = (phy_sum - (  -2.83e-07  ))/ 4.518659 if year==0
gen sexual_index_norm  = (sexual_sum - (  4.03e-07  ))/2.456276   if year==0
gen control_index_norm = (control_sum - (-.0014415 ))/ 3.726277  if year==0
gen ipv_index_norm     = (ipv_sum - ( -.0446896  ))/ 8.817079    if year==0  


* for endline observations 
* mean and sd of outcome index for control group (Jharkhand) at endline 
sum emo_sum phy_sum sexual_sum control_sum ipv_sum if bihar == 0 & year == 1


    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
     emo_sum |      1,279    6.30e-08    2.540827  -.8000319   11.24441
     phy_sum |      1,279   -4.94e-07     5.19528  -2.256062   25.53583
  sexual_sum |      1,279    1.05e-07    2.668636  -.6298639   14.37928
 control_sum |      1,276   -.0055602    3.859868  -3.524793   10.84849
     ipv_sum |      1,276   -.0011405    11.09877  -7.210751     62.008




* we standardize the index variables for all observations at endline (bihar and Jharkhand)

replace emo_index_norm     = (emo_sum -( 6.30e-08  ))/  2.540827 if year==1
replace phy_index_norm     = (phy_sum - ( -4.94e-07 ))/ 5.19528  if year==1
replace sexual_index_norm  = (sexual_sum - (1.05e-07    ))/ 2.668636   if year==1
replace control_index_norm = (control_sum - (-.0055602  ))/ 3.859868    if year==1
replace ipv_index_norm     = (ipv_sum - ( -.0011405  ))/ 11.09877   if year==1   	 
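
* A hedged sketch, not part of the original analysis: the same standardization
* can read the control-group mean and sd from r(mean) and r(sd) after summarize,
* instead of copying rounded values from the output by hand.
* The *_index_chk variable names are hypothetical and exist only for this check.
foreach v in emo phy sexual control ipv {
    gen double `v'_index_chk = .
    forvalues t = 0/1 {
        * Jharkhand (bihar == 0) mean and sd of the index in period `t'
        quietly summarize `v'_sum if bihar == 0 & year == `t'
        replace `v'_index_chk = (`v'_sum - r(mean)) / r(sd) if year == `t'
    }
}
* The check variables should agree with the hand-coded ones up to rounding
summarize emo_index_norm emo_index_chk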


* All outcome index variables are now standardized.

* create the non linear terms

* We first create an indicator for pre-period Bihar observations (year == 0). These observations serve as the treated group for matching.
* We then match the remaining three groups (post Bihar, pre Jharkhand and post Jharkhand) to pre Bihar and create weights.


gen control_year = 1 if year == 0
replace control_year = 0 if year == 1
gen bihar_control = bihar * control_year
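
* A quick sanity check, offered as a sketch rather than as part of the original
* analysis: bihar_control should equal 1 only for Bihar observations with
* year == 0, and 0 for the other three groups.
tabulate bihar year if bihar_control == 1
assert bihar_control == (bihar == 1 & year == 0) if !missing(bihar, year)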

* We now match (bihar_control equals 1 for pre-period Bihar observations).
* Base-category dummies are omitted to avoid the dummy variable trap.

psmatch2 bihar_control resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   , kernel common 
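
* A hedged sketch of post-matching diagnostics, not in the original analysis.
* psmatch2 stores its results in _pscore, _treated, _support and _weight.
* pstest is assumed to be installed alongside psmatch2, and the covariates
* passed to it are an abbreviated, illustrative subset of the matching list.
* How many observations fall on and off the common support
tabulate _treated _support
* Distribution of the estimated propensity score on the common support
summarize _pscore if _support == 1, detail
* Covariate balance before and after kernel weighting
pstest resp_age resp_edu year_marr husb_edu hh_size, both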


diff emo_index_norm [aw=_weight] , p(year)  t(bihar)  cov(resp_age  resp_edu  year_marr  caste  husb_edu poorest poorer middle  richer    religion hhd_gender place_residence hhd_age hh_size  wi1_age wi2_age wi3_age wi4_age   hedu_cas2   hedu_wi1 hedu_wi2 hedu_wi3 hedu_wi4  cas2_rel2    edu_yea rel2_wi1 rel2_wi2 rel2_wi3 rel2_wi4   dist1 dist2 dist3 dist4 dist5 dist6 dist7 dist8 dist9 dist10 dist11 dist12 dist13 dist14 dist15 dist16 dist17 dist18 dist19 dist20 dist21 dist22 dist23 dist24 dist25 dist26 dist27 dist28 dist29 dist30 dist31 dist32 dist33 dist34 dist35 dist36 dist37 dist38 dist39 dist40 dist41 dist42 dist43 dist44 dist45 dist46 dist47 dist48 dist49 dist50 dist51 dist52 dist53 dist54 dist55 dist56 dist57 dist58 dist59 dist60 dist61 dist62 ) report cluster(distyear) robust
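
* A hedged cross-check, not part of the original analysis: the
* difference-in-differences estimate corresponds to the bihar x year interaction
* in a weighted regression with the same weights and clustering.
* The covariate list below is abbreviated for illustration (an assumption), so
* the estimate will differ from the full specification unless the complete
* cov() list is substituted in. covars_short is a hypothetical macro name.
local covars_short resp_age resp_edu year_marr caste husb_edu hh_size
regress emo_index_norm i.bihar##i.year `covars_short' [aw=_weight], vce(cluster distyear)
* The DiD estimate is the coefficient on the interaction term
lincom 1.bihar#1.year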


/* Table A13: Pre and Post Covid IPV measures in Jharkhand */ 

 /* create a copy of practise file */ 
 

keep if bihar==0

keep if year==1

* Interviews in January-March 2020 (v006 = interview month, v007 = interview year) are coded as pre-Covid; all others as post-Covid
gen post_covid = 0 if inlist(v006, 1, 2, 3) & v007 == 2020
replace post_covid = 1 if post_covid == .

/* save the file */ 
* We use the dummy outcome variables (*_d) here, since the normalized indices add nothing to this comparison.

summ *_d if post_covid==0
summ *_d if post_covid==1
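
* A possible way to compare the two periods side by side and to test the gap in
* each measure. This is a sketch that assumes a simple two-sample t-test is
* acceptable here; the original only reports the two sets of summary statistics.
tabstat *_d, by(post_covid) statistics(mean sd n)
foreach v of varlist *_d {
    * Difference in means between pre-Covid (0) and post-Covid (1) interviews
    ttest `v', by(post_covid)
}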


